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ABSTRACT 

A maximum entropy method (MEM) is presented for separating the emission due to 
different foreground components from simulated sateUite observations of the cosmic 
microwave background radiation (CMBR). In particular, the method is applied to 
simulated observations by the proposed Planck Surveyor satellite. The simulations, 
performed by Bouchet and Gispert (1998), include emission from the CMBR, the 
kinetic and thermal Sunyaev-Zel'dovich (SZ) effects from galaxy clusters, as well as 
Galactic dust, free-free and synchrotron emission. We find that the MEM technique 
performs well and produces faithful reconstructions of the main input components. The 
method is also compared with traditional Wiener filtering and is shown to produce 
consistently better results, particularly in the recovery of the thermal SZ effect. 
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1 INTRODUCTION 

The importance of making accurate measurements of the 

■ fluctuations in the CMBR is now widely appreciated. In- 

■ deed, by making maps of these fluctuations and by mea- 
suring their power spectrum, it is hoped that tight con- 
straints may be placed on fundamental cosmological param- 
eters and that we may distinguish between competing the- 

, ories of structure formation in the early Universe such as 

' inflation and topological defects. 

Several ground-based and balloon-borne experiments 
are planned over the next few years, and these should pro- 
vide accurate images of the CMBR fluctuations and lead to 
a significant improvement in the measurement of the CMBR 
power spectrum. Nevertheless, these experiments are un- 
likely to be able to achieve the accuracy required to resolve 
numerous degeneracies that exist in the parameter set of, 
for example, the standard inflationary CDM model. As a 
result, a new generation of CMBR satellites are currently in 
the flnal stages of design, and it is hoped that these exper- 
iments will provide definitive measurements of the CMBR 
power spectrum as well as detailed all-sky maps of the fiuc- 
tuations. 

According to current estimates, the NASA MAP satel- 
lite is due to be launched in 2000, followed by the ESA 
Planck Surveyor mission in 2005. Both experiments aim to 
make high-resolution, low-noise maps of the whole sky at 
several observing frequencies. As with any CMBR experi- 
ment, however, the maps produced will contain contribu- 
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tions from various foreground components. The main fore- 
ground components are expected to be Galactic dust, free- 
free and synchrotron emission as well as the kinetic and 
thermal SZ effects from galaxy clusters. In addition, signifi- 
cant contamination from extragalactic points sources is also 
likely. 

It is clear that in order to obtain maps of the CMBR 
fluctuations alone it is necessary to separate the emission 
due to these various components. The removal of point 
sources from the satellite observations is perhaps the most 
troublesome aspect of this separation, since our knowledge 
of the various populations of sources is incomplete. Never- 
theless, at observing frequencies in the range 10-100 GHz, 
we expect the point sources to be mainly radio- loud AGN, 
including flat-spectrum radiogalaxies and QSOs, blazars and 
possibly some inverted-spectrum radiosources. At higher ob- 
serving frequencies in the range 300-900 GHz, the dominant 
point sources should be infrared luminous galaxies, radio- 
quiet AGN and smaller numbers of high-redshift galaxies 
and QSOs. However, since the frequency spectra of many 
of these extragalactic objects are, in general, rather com- 
plicated, any extrapolation of their emission over a wide 
frequency interval must be performed with caution. 

For small flelds, a straightforward and effective tech- 
nique for removing point sources is to make high-resolution, 
high-flux-sensitivity observations of each fleld, at frequencies 
close to those of the CMBR experiment. The point sources 
can then be identified and accurately subtracted from the 
maps (O'SulIivan et al. 1995). For multifrequency all-sky 
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satellite observations, however, such a procedure is infea- 
sible. Nevertheless, for the Planck Surveyor, it is expected 
that a significant fraction of point sources may be identified 
and removed using the satellite observations themselves, to- 
gether perhaps with pre-existing surveys. Based on the es- 
timated sensitivity of the Planck Surveyor to point sources, 
De Zotti et al. (1997) find that it is straightforward, at each 
observing frequency independently, to subtract all sources 
brighter than 1 Jy and that it may be possible to subtract 
all sources brighter than 100 mjy at intermediate frequen- 
cies where the CMBR emission peaks. Careful modelling of 
the likely point source contamination also suggests that the 
number of pixels affected at each frequency should only be a 
small percentage of the total number. Moreover, De Zotti et 
al. find the level of fiuctuations due to unsubtracted sources 
to be very low. Similar conclusions follow from the model 
of Guiderdoni et al. (1997, 1998). Using simulated maps of 
point sources (Toffolatti et al. 1998), a full investigation of 
their efi'ects on Planck Surveyor observations will be pre- 
sented in a forthcoming paper (Hobson et al, in prepara- 
tion). 

Aside from extragalactic point sources, the other phys- 
ical components mentioned above have reasonably well de- 
fined spectral characteristics, and we may use this informa- 
tion, together with multifrequency observations, to distin- 
guish between the various foregrounds. Several linear meth- 
ods have been suggested to perform this separation, many 
of which are based on Wiener filtering (e.g. Bouchet, Gis- 
pert & Puget 1996; Tegmark & Efstathiou 1996; Bouchet et 
al. 1997). In this paper, however, we investigate the use of a 
non-linear maximum entropy method (MEM) for separating 
out the emission due to different physical components and 
compare its performance with the Wiener filter approach. 
We apply these methods to simulated observations from the 
Planck surveyor satellite but, of course, the same algorithms 
can be used to analyse data from the MAP satellite. The ap- 
plication of the MEM technique to simulated interferometer 
observations of the CMBR is discussed in Maisinger, Hob- 
son & Lasenby (1997) and the method has also been applied 
to the analysis of ground-based switched-beam observations 
by Jones et al. (1997). 



2 SIMULATED PLANCK SURVEYOR 
OBSERVATIONS 

In order to create simulated Planck Surveyor observations, 
we must first build a realistic model of the sky at each of the 
proposed observing frequencies. As mentioned above, dust, 
free-free and synchrotron emission from our own Galaxy, 
extragalactic radiosources and infrared galaxies, and the ki- 
netic and thermal SZ effect from clusters of galaxies all con- 
tribute to sky emission at least at some frequencies and an- 
gular resolutions of interest. We assume that point sources 
may be removed as described above, and that the residual 
background of unsubtracted sources is negligible. Thus the 
simulations presented here include emission from the three 
Galactic components, the two SZ effects and the primordial 
CMBR fluctuations. 

Simulated maps of these six components are con- 
structed on 10 X 10-degree flelds with 1.5 arcmin pixels; thus 
each map consists of 400x400 pixels. A detailed discussion of 



these simulations is given by Bouchet et al. (1997), Gispert & 
Bouchet (1997) and Bouchet & Gispert (1998). The primary 
CMBR fluctuations are a realisation of a COBE-normalised 
standard CDM model with critical density and a Hubble 
parameter Hq = 50 km s~^ Mpc^^ (using a program kindly 
provided by J.R. Bond). Realisations of the kinetic and ther- 
mal SZ effects are generated using the Press-Schechter for- 
malism, as discussed in Aghanim et al. (1997), which yields 
the number density of clusters per unit redshift, solid an- 
gle and flux density interval. The gas profiles of individual 
clusters are taken as King models, and their peculiar radial 
velocities are drawn at random from an assumed Gaussian 
velocity distribution with a standard deviation at z = of 
400 km s"^ 

For the Galactic dust and free-free emission, 100-/im 
IRAS maps are used as spatial templates. Comparison of 
dust, free-free and 21cm Hi emission suggests the existence 
of a spatial correlation between these components (Kogut et 
al. 1996; Boulanger et al. 1996). In order to take account of 
these correlations, the simulations assume the existence of 
an Hl-correlated component that accounts for 50 per cent 
of the free-free emission and 95 per cent of the dust emis- 
sion. The remaining free-free and dust emission is assumed 
to come from a second, Hl-uncorrelated component. For any 
particular simulation, a given 100-/im IRAS map is used as 
a spatial template for the Hl-correlated component and a 
contiguous map is used for the Hl-uncorrelated component. 
The dust spectral behaviour is modelled as a single temper- 
ature component at 18 K with dust emissivity oc the 
rms level of fiuctuations at any given frequency is scaled 
accordingly from the 100-/im IRAS map. The IRAS map 
used here has an rms level approximately equal to the me- 
dian level for such maps, i.e. about one-half of IRAS 100-/.im 
maps of this size have a lower rms, and half have a higher 
rms when scales between 1 and 3 degrees are included (see 
Bouchet et al. 1996 for details). The free- free intensity is 
assumed to vary as 7 cx v~^'^^ , and is normalised to give an 
rms temperature fiuctuation of 6.2 /iK at 53 GHz. 

No spatial template is available for the synchrotron 
emission at a sufficiently high angular resolution, so the sim- 
ulations of this component are performed using the 408 MHz 
radio maps of Haslam et al. (1982), which have a resolu- 
tion of 0.85 degrees, and adding to them (Gaussian) small 
scale structure that follows a oc power spectrum. The 
synchrotron intensity is assumed vary as / oc v~^'^ and its 
normalisation taken directly from the 408 MHz maps. 

For primary CMBR fluctuations it is usual to work in 
terms of temperature rather than intensity. A temperature 
difference on the sky ATcmb(A) leads to a fluctuation in the 
intensity given by 



A/cmb(*,i^) 



dB{v, T) 



dT 



T=To 



where B{v,T) is the Planck function and To = 2.726 K is 
the mean temperature of the CMBR (Mather et al. 1994). 
The conversion factor can be approximated by 



dB{u, T) 



dT 



24.8 



T=To 



sinh(3;/2) 



Jy sr-i (mK)-i, 



where x « i^/56.8 GHz. In order to compare the relative 
level of fluctuations in each physical component we shall 



© 1998 RAS, MNRAS 000, 



Foreground separation methods 3 



I 



1 

ij 






^^-^ 1 


— « h — 

1 1 


H =1 

INK. 

1 - 





s Cd) 





Cf) 




— I — I — h 




■3: 



Figure 1. The 10 X 10-degree realisations of the six input components used to make simulated Planck Surveyor observations; (a) primary 
CMBR fluctuations; (b) kinetic SZ effect; (c) thermal SZ effect; (d) Galactic dust; (e) Galactic free-free; (f) Galactic synchrotron emission. 
Each component is plotted at 300 GHz and has been convolved with a Gaussian beam of FWHM equal to 4.5 arcmin, the maximum 
angular resolution proposed for the Planck Surveyor. The map units are equivalent thermodynamic temperature in /jK. 
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adopt the convention of Tegmark & Efstathiou (1996) and 
also define the equivalent thermodynamic temperature fluc- 
tuation for the other components by 



AIp{x, v) 
dB{v,To)/dT' 



where p denotes the relevant physical foreground compo- 
nent. We note that, in general, the 'temperature' fluctua- 
tions of these other components will be frequency dependent, 
unlike those of the CMBR. For the remainder of this paper, 
fluctuations will be quoted in temperature units measured 
in fiK. 

The realisations of the six input components used to 
make simulated observations are shown in Fig. ^. Each com- 
ponent is plotted at 300 GHz and, for illustration purposes, 
has been convolved with a Gaussian beam of FWHM equal 
to 4.5 arcmin, which is the highest angular resolution pro- 
posed for the Planck Surveyor. For convenience, we have also 
set the mean of each map to zero, in order to highlight the 
relative level of fluctuations due to each component. 

From Fig. |l]we see that, as expected, the emission due 
to primordial CMBR fluctuations appears Gaussian in na- 
ture. This is, of course, a direct consequence of using a stan- 
dard inflationary CDM model to create this realisation. If 
the CMBR realisation were instead created assuming an al- 
ternative theory of structure formation such as topological 
defects, for example, then the CMBR fluctuations are not 
required to be Gaussian, but may exhibit sharp edges or 
highly non-Gaussian localised hot spots (Bouchet, Bennett 
& Stebbins 1988, Turok 1996). The emission due to the ki- 
netic and thermal SZ effects is clearly highly non-Gaussian, 
being dominated by resolved and unresolved clusters that 
appears as sharp peaks of emission. As we would expect, al- 
though an obvious correlation exists between the positions 
of the kinetic and thermal SZ effects, the signs and mag- 
nitudes of the kinetic effect are not correlated with those 
of the thermal effect. We also note that the IRAS 100-fim 
maps used as templates for the Galactic dust and free-free 
emission also appear quite non-Gaussian; the imposed corre- 
lation between the dust and free-free emission is also clearly 
seen. Finally, the synchrotron emission seems quite Gaus- 
sian, although this appearance is due mainly to the addi- 
tion to the Haslam 408 MHz map of Gaussian small scale 
structure, following a. Ci oc £~'' power law, on angular scales 
below 0.85 degrees. 

The azimuthally-averaged power spectra of the input 
maps are shown in Fig. |2| At lower multipoles, all three 
Galactic components have power spectra which vary roughly 



as Ce oc 



(for the synchrotron component small scale 



structure with this power spectrum was added artiflcial for 
£ ^ 250). For the kinetic and thermal SZ effects, however, 
the power spectra are quite different and are better approx- 
imated by a white-noise power spectrum C'e oc constant, as 
expected for Poisson-distributed processes. 

Using the realisations for each physical component 
shown in Fig. |l|, it is straightforward to simulate Planck Sur- 
veyor observations. The satellite is made up of two mains 
parts: the Low Frequency Instrument (LFI), which uses 
HEMT radio receivers, and the High Frequency Instrument 
(HFI), which contains bolometer arrays. Since the final de- 
sign of the satellite is still undecided, the precise values of 
observational parameters for the LFI and HFI are subject 
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Figure 2. The azimuthally-averaged power spectra of the input 
maps at 300 GHz. 



to revision. Nevertheless, recent proposed changes to both 
instruments may significantly improve the sensitivity of the 
satellite, as compared to the design outlined in the ESA 
phase A study (Bersanelli et al. 1996). Therefore, although 
these modifications are not yet finalised, we have incorpo- 
rated the latest design specifications into our simulations. 
The parameters used in making the simulated observations 
are given in Table ^. 

The simulated observations are produced by integrating 
the emission due to each physical component across each 
waveband, assuming the transmission is uniform across the 
band. At each observing frequency, the total sky emission is 
convolved with a Gaussian beam of the appropriate FWHM. 
Finally, isotropic noise is added to the maps, assuming a 
spatial sampling rate of FWHM/2.4 at each frequency (thus 
the noise rms of the maps is about 2.4 times higher than 
the instrumental sensitivity per FWHM quoted in Table 
We note, however, that the assumption of isotropic noise 
is not required by the separation algorithms discussed in 
Section |^ We have also assumed that any striping due to 
the scanning strategy and 1/f noise has been removed to 
sufficient accuracy that any residuals are negligible. 

Fig. ^ shows the rms temperature fluctuations at each 
observing frequency due to each physical component, after 
convolution with the appropriate beam. The rms noise per 
pixel at each frequency channel is also plotted. We see from 
the figure that, as expected, the rms temperature fiuctua- 
tion of the CMBR is almost constant across the frequency 
channels; the only variation being due to the convolution 
with beams of different sizes. Furthermore, for all channels 
up to 217 GHz, the CMBR signal is several times the level of 
the instrumental noise even for a pixelisation at FWHM/2.4 
(which boosts by 2.4 the noise level per FWHM). At higher 
frequencies, the noise level exceeds the CMBR signal but is 
itself dominated by Galactic dust emission. We also see a 
sharp dip in the rms level of the thermal SZ effect at 217 
GHz, since the emission from this component is close to zero 
at this frequency. At any given frequency, the rms level of 
the thermal SZ effect is at least an order of magnitude be- 
low that of the dominant component. The kinetic SZ effect 
has the same spectral characteristics as the CMBR, but the 
effect of convolution with beams of different sizes has a sig- 
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Table 1. Proposed observational parameters for the Planck Surveyor satellite (Efstathiou, private communication). Angular resolution 
is quoted as FWHM for a Gaussian beam. Sensitivities are quoted per FWHM for 12 months of observation. 



Low Frequency Instrument High Frequency Instrument 



Central frequency (GHz): 


30 


44 


70 


100 


100 


143 


217 


353 


545 


857 


Fractional bandwidth (Au/u): 


0.2 


0.2 


0.2 


0.2 


0.37 


0.37 


0.37 


0.37 


0.37 


0.37 


Transmission: 


1.0 


1.0 


1.0 


1.0 


0.3 


0.3 


0.3 


0.3 


0.3 


0.3 


Angular resolution (arcmin): 


33 


23 


14 


10 


10.6 


7.4 


4.9 


4.5 


4.5 


4.5 


AT sensitivity (/iK): 


4.4 


6.5 


9.8 


11.7 
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12.5 


40.9 


392 
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Figure 3. The rms thermodynamic temperature fluctuations at 
each Planck Surveyor observing frequency due to each physical 
component, after convolution with the appropriate beam and us- 
ing a sampling rate of FWHM/2.4. The rms noise per pixel at 
each frequency channel is also plotted. 



nificant effect on the point-like emission and leads to a more 
pronounced variation in the observed rms level than for the 
CMBR (since then most of the power is at small scales). 
The observed rms level of the kinetic SZ is at least two or- 
ders of magnitude below the dominant component at any 
given frequency. In a similar manner, the Galactic free-free 
and synchrotron emission are also completely dominated by 
either CMBR or dust emission at all observing frequencies. 

The observed maps at each of the ten Planck Surveyor 
frequencies are shown in Fig. ^ in units of equivalent ther- 
modynamic temperature measured in jjK. The coarser pix- 
elisation at the lower observing frequencies is due to the 
FWHM/2.4 sampling rate. Moreover, at these lower frequen- 
cies, the effect of convolution with the relatively large beam 
is also easily seen. As the observing frequency increases, 
the beam size becomes smaller, leading to a corresponding 
increase in the sampling rate. Consequently, the observed 
maps more closely resemble the input map of the dominant 
physical component at each frequency. As may have been 
anticipated from Fig. ^, the emission in the lowest seven 
channels is dominated by the CMBR, whereas dust emission 
dominates in the highest three channels. Indeed, the main 
reason for the inclusion of the highest frequency channels is 
to obtain an accurate dust model, in order that it may be 
subtracted from lower frequency channels with some confi- 
dence. Perhaps the most notable feature of the ten channels 
maps is that, at least by eye, it is not possible to discern 



features due to physical components other than the CMBR 
or dust. 



3 COMPONENT SEPARATION METHODS 

As a first step in discussing component separation meth- 
ods, let us consider in more detail how the simulated data 
are made. At any given frequency v the total rms temper- 
ature fluctuation on the sky in a direction x is given by 
the superposition of Uc physical components (tIc = 6 in our 
simulations). It is convenient to factorise the contribution 
of each process into a spatial template Sp{x) at a reference 
frequency and a frequency dependence ,fp{v), so that 



AT(x,i/) 



p=i 



ATp(x,i/) 



In this paper we take the reference frequency — 300 
GHz and normalise the frequency dependencies such that 
fp{iyo) = 1. 

If we observe the sky at n/ observing frequencies then, 
in any given direction x, we obtain a n/-component data 
vector that contains the observed temperature fluctuation in 
this direction at each observing frequency plus instrumental 
noise. In order to relate this data vector to the emission from 
each physical component it is useful to introduce the Uf xric 
frequency response matrix with components defined by 



up 



ti,{v')fp{v') Au' 



(1) 



where tv{v') is the frequency response (or transmission) of 
the vth frequency channel. Assuming that the satellite ob- 
serving beam in each channel is spatially invariant, we may 
write the beam-smoothing as a convolution and, in discre- 
tised form, the i^th component of the data vector in the 
direction x is then given by 

di^ix) = ^ P^{\x - Xj\) ^ F^p Sp{xj) + e, 



,(*) 



(2) 



p=i 



where Pu{x) is the beam profile for the vth frequency chan- 
nel, and the index j labels the Np pixels in each of the simu- 
lated input maps shown in Fig. HI; the ei, (x) term represents 
the instrumental noise in the vth channel in the direction x. 

In each channel the beam profile is assumed spatially 
invariant and the noise statistically homogeneous (which are 
both reasonable assumptions for small fields), and it is more 
convenient to work in Fourier space, since the convolution 
in (Id) becomes a simple multiplication and we obtain 
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Figure 4. The 10 X 10-degree maps observed at each of the ten Planck Surveyor frequencies Usted in Table Q At each frequency we 
assume a Gaussian beam with the appropriate FWHM and a sampling rate of FWHM/2.4. Isotropic noise with the relevant rms has 
been added to each map. The map units are equivalent thermodynamic temperature in fiK. 
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d^(k) = ^ R^p{k)sp{k) + e„(fe) 



(3) 



where R^p{k) = Pv{k)Fvp are the components of the re- 
sponse matrix for the observations. It is important to note 
that ^ is satisfied at each Fourier mode fc independently. 
Thus, in matrix notation, at each mode we have 



Rs + e 



(4) 



where d, s and e are column vectors containing nf,nc and n/ 
complex components respectively, and the response matrix R 
has dimensions Uf x ric. Although the column vectors in 
refer to quantities defined in the Fourier domain, it should 
be noted that for later convenience they are not written with 
a tilde. 

The significant simplification that results from working 
in the Fourier domain is clear, since the dimensions of the 
matrices in are rather small {ric = 6 and n/ = 10 in 
our simulations). Thus, the situation reduces to the solving 
a small-scale linear inversion problem at each Fourier mode 
separately. Once this inversion has been performed for all the 
measured modes, the spatial templates for the sky emission 
due to each physical component at the reference frequency 
uo axe then obtained by an inverse Fourier transformation. 
Owing to the presence of instrumental noise, however, it is 
clear that the inverse, R~^, of the response matrix at each 
Fourier mode does not exist and that the linear inversion 
problem in each case is degenerate. The approximate inver- 
sion of must therefore be performed using a statistical 
technique in which the inversion is regularised in some way. 
This naturally leads us to consider a Bayesian approach. 



frequency, the likelihood is therefore given by 

Pr(d|s) oc exp(-etN~^e) 

cx exp [-(d- Rs)tN"^(d- Rs)] , (6) 

where the dagger denotes the Hermitian conjugate and in 
the second line we have used (^. We note that no factor of 
1/2 appears in the exponent in 1^) since it refers to the mul- 
tivariate Gaussian distribution of a set of complex random 
variables. The noise covariance matrix N has dimensions 
rif X rif and at any given Fourier mode k it is defined by 



N(fc) = (e(fe)et(fc)). 



(7) 



i.e. its elements are given by N^^i (k) = ( ey(fe) e *, (fe)) , where 
the asterisk denotes complex conjugation. Thus, at a given 
Fourier mode, the uih diagonal element of N contains the 
value at that mode of the ensemble-averaged power spectra 
of the instrumental noise on the uih frequency channel. If the 
noise is uncorrelated between channels then the off-diagonal 
elements are zero for all k. 

We note that the expression in square brackets in 
is simply the misfit statistic. Since, for a given set of 
observations, the data vector d, the response matrix R and 
the noise covariance matrix N are all fixed, we may consider 
the misfit statistic as a function only of the signal vector s, 



x'(s) = (d-Rs)tN-i(d-Rs), 

so that the likelihood can be written as 

Pr(d|s) oc exp[--x^(s)]- 



(8) 



(9) 



Having calculated the form of the likelihood we must now 
turn our attention to the form of the prior probability Pr(s). 



3.1 Bayes' theorem 

Bayes' theorem states that, given a hypothesis H and some 
data D the posterior probability Pr{H\D) is the product 
of the likelihood Pt{D\H) and the prior probability Pt{H), 
normalised by the evidence Pr{D) , 

In our application, we consider each Fourier mode k 
separately and, from (^) , we see that the data consist of the 
Uf complex numbers in the data vector d, and we take the 
'hypothesis' to consist of the ric complex numbers in the 
signal vector s. We then choose as our estimator s of the 
signal vector that which maximises the posterior probabil- 
ity Pr(sjd). Since the evidence in Bayes' theorem is merely 
a normalisation constant we must therefore maximise with 
respect to s the quantity 

Pr(sid) oc Pr(d|s)Pr(s) (5) 

which is the product of the likelihood Pr(djs) and the prior 
Pr(s). 

Let us first consider the form of the likelihood. If the 
instrumental noise on each frequency channel is Gaussian- 
distributed, then at each Fourier mode the probability dis- 
tribution of the n/-component noise vector e is described 
by an n/- dimensional multivariate Gaussian. Assuming the 
expectation value of the noise to be zero at each observing 



3.2 The Gaussian prior 

If we assume that the emission due to each of the physical 
components shown in Fig. ^ is well approximated by a Gaus- 
sian random field, then it is straightforward to derive an 
appropriate form for the prior Pr(s). In this case, the proba- 
bility distribution of the sky emission is described by a mul- 
tivariate Gaussian distribution, characterised by a given sky 
covariance matrix. Thus, at each mode k in Fourier space, 
the probability distribution of the signal vector s is also de- 
scribed by a multivariate Gaussian of dimension Uc, where 
ric is the number of distinct physical components (ric = 6 in 
our simulations). The prior therefore has the form 

Pr(s) oc exp (-s^C'^s) , (10) 

where the signal covariance matrix C is real with dimensions 
ric X ric and is given by 

C(fe) = {s(fe)st(fe)), (11) 

i.e. it has elements Cppi(k) — {sp{k) s*,{k)). Thus, at each 
Fourier mode, the pth diagonal element of C contains the 
value of the ensemble-averaged power spectrum of the pth 
physical component at the reference frequency uo', the off- 
diagonal terms describe cross-power spectra between the 
components. 

Strictly speaking, the use of this prior requires advance 
knowledge of the full covariance structure of the processes 
that we are trying to reconstruct. Nevertheless, it is antici- 
pated that some information concerning the power spectra 
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of the various components, and correlations between them, 
will be available either from pre-existing observations or by 
performing an initial approximate separation using, for ex- 
ample, the singular value decomposition (SVD) algorithm 
(see Bouchet et al. 1997, Bouchet & Gispert 1998). A dis- 
cussion of the SVD solution, in the context of Bayes' theo- 
rem, is given in Appendix C. This information can then be 
used to construct an approximate signal covariance matrix 
for use in Pr(s). 

Substituting d) and (0) into (|), the posterior proba- 
bility is then given by 



Pr(s|d) oc exp 



-X'(s) - 



(12) 



where X^(s) is given by (^. Completing the square for s in 
the exponential (see Zaroubi et al. 1995), it is straightfor- 
ward to show that the posterior probability is also a multi- 
variate Gaussian of the form 

Pr(sid) oc exp [-(s - s)'''E"^(s - s)] . (13) 

which has its maximum value at the estimate s of the signal 
vector and where E is the covariance matrix of the recon- 
struction errors. 

The estimate s of the signal vector is found to be 

s = (C"^ + R+N^^R)^ R^N^^d = Wd, (14) 

where we have identified the Wiener matrix W. Thus, we 
find that by assuming a Gaussian prior of the form ( [lo[ ) 
in Bayes' theorem, we recover the standard Wiener filter. 
This optimal linear filter is usually derived by choosing the 
elements of W such that they minimise the variances of the 
resulting reconstruction errors. From ( |l^ we see that at a 
given Fourier mode, we may calculate the estimator s that 
maximises the posterior probability simply by multiplying 
the data vector d by the Wiener matrix W. Equation (14) 
can also be derived straightforwardly by difi'erentiating (12) 
with respect s and equating the result to zero (see Appendix 
A). 

As is well-known, the assignment of errors on the 
Wiener filter reconstruction is straightforward and the co- 
variance matrix of the reconstruction errors E in (|l^ ) is given 

by 

E ((s - s)(s - s)t) = (C-^ -I- R^N-'r)"' (15) 

Since the posterior probability (^) is Gaussian, this matrix 
is simply the inverse Hessian or curvature matrix of (minus) 
the exponent in (|l^), evaluated at s (see Appendix A). 

It should be noted that the linear nature of the Wiener 
filter and the simple propagation of errors are both direct 
consequences of assuming that the spatial templates we wish 
to reconstruct are well-described by Gaussian random fields 
with a known covariance structure. Several applications are 
given in Bouchet et al. (1997). 



3.3 The entropic prior 

It is clear from Fig. |l| that the emission due to several of 
the underlying physical processes is far from Gaussian. This 
is particularly pronounced for the kinetic and thermal SZ 
effects, but the Galactic dust and free-free emissions also 
appear quite non-Gaussian. Ideally, one might like to as- 
sign priors for the various physical components by measur- 
ing empirically the probability distribution of temperature 



fiuctuations from numerous realisations of each component. 
This is not feasible in practice, however, and instead we con- 
sider here the use of the entropic prior, which is based on 
information-theoretic considerations alone. 

Let us consider a discretised image hj consisting of L 
cells, so that j = 1, . . . ,L; we may consider the hj as the 
components of an image vector h. Using very general notions 
of subset independence, coordinate invariance and system 
independence, it may be shown (Skilling 1989) that the prior 
probability assigned to the values of the components in this 
vector should take form 

Pr(h) cx exp[QS'(h, m)], 

where the dimensional constant a depends on the scaling 
of the problem and may be considered as a regularising pa- 
rameter, and m is a model vector to which h defaults in the 
absence of any data. The function ^(h, m) is the cross en- 
tropy of h and m. In standard applications of the maximum 
entropy method, the image h is taken to be a positive addi- 
tive distribution (PAD). Nevertheless, the MEM approach 
can be extended to images that take both positive and neg- 
ative values by considering them to be the difference of two 
PADS, so that 



where u and v are the positive and negative parts of h re- 
spectively. In this case, the cross entropy is given by (Gull 
& Skilling 1990; Hobson & Lasenby 1998) 



J = l 



hj In 



^j + hj 



2mu 



.(16) 



where ipj = [h'^ +4mujm^j]^''^ and m„ and m„ are separate 
models for each PAD. The global maximum of the cross 
entropy occurs at h = m„ — m„ . 

In our application, we might initially suppose that at 
each Fourier mode we should take the 'image' to be the 
components of the signal vector s. However, this results in 
two additional complications. First, the components of sig- 
nal vector are, in general, complex, but the cross entropy 
given in ( |l6[ ) is defined only if the image h is real. Neverthe- 
less, the MEM technique can be straightforwardly extended 
to the reconstruction of a complex image by making a slight 
modification to the above discussion. If the image h is com- 
plex, then models m„ and m„ are also taken to be complex. 
In this case, the real and imaginary parts of rriu are the mod- 
els for the positive portions of the real and imaginary parts 
of h respectively. Similarly, the real and imaginary parts of 
m„ are the models for the negative portions of the real and 
imaginary parts of the image. The total cross entropy is then 
obtained by evaluating the sum (|l^ using first the real parts 
and then the imaginary parts of h, m„ and rtii,, and adding 
the results. Thus the total cross entropy for the complex 
image h is given by 



S(5R(h), K(m„), K(m,)) + 5(3(h), 3(m„), 3(m,)), 



(17) 



where K and 9 denote the real and imaginary parts of 
each vector. For simplicity we denote the sum ( p^ ) by 
S'c(h, rtiu, rrii,) where the subscript c indicates that it is the 
entropy of a complex image. 

The second complication mentioned above is more sub- 
tle and results from the fact that one of the fundamental 
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axioms of the MEM is that it should not itself introduce 
correlations between individual elements of the image. How- 
ever, as discussed in previous subsection, the elements of the 
signal vector s at each Fourier mode may well be correlated, 
this correlation being described by the signal covariance ma- 
trix C defined in Moreover, if prior information is avail- 
able concerning these correlations, we would wish to include 
it in our analysis. We are therefore lead to consider the in- 
troduction of an intrinsic correlation function (ICF) into the 
MEM framework (Gull & Skilling 1990). 

The inclusion of an ICF is most easily achieved by as- 
suming that, at each Fourier mode, the 'image' does not 
consist of the components of the signal vector s, but that 
instead h consists of the components of a vector of hidden 
variables that are related to the signal vector by 



s = Lh, 



(18) 



The Uc X Tic lower triangular matrix L in (y_8|) is that ob- 
tained by performing a Cholesky decomposition of the signal 
covariance matrix, i.e. C = LL^. We note that since C is real 
then so is L. Thus, if the components of h are apriori un- 
correlated (thereby satisfying the MEM axiom) and of unit 
variance, so that ( hp h*, ) = Sppi , we find that, as required, 
the a priori covariance structure of the signal vector is given 
by 



(sst) 



(LhhtL^) 



L(hht)L^ = LL^ = C. 



Moreover, using this construction the expected rms level for 
the real or imaginary part of each element of h is simply 
equal to l/-\/2. Therefore, at each Fourier mode, we as- 
sign the real and imaginary parts of every component in 
the model vectors rtiu and to be equal to m = l/-\/2. 

Substituting (^^ into (^), can also be written in 
terms of h and is given by 



X^(h) = (d - RLh)tN"^(d RLh) 



(19) 



Thus, using an entropic prior, the posterior probability be- 
comes 

Pr(h|d) oc exp [-x'(h) + «Se(h, m„, m„)] . (20) 
where the cross entropy Sc(h, mu,mv) is given by ( p^ ) and 



3.4 Maximising the posterior probability 



As discussed in Section 3.1, we choose our estimate s of the 



signal vector at each Fourier mode, as that which maximises 
the posterior probability Pr(sjd) with respect to s. 

For the Gaussian prior, we found in subsection that 
the posterior probability is also a Gaussian and that the 
estimate s is given directly by the linear relation (|l^). Nev- 
ertheless, we also note that, in terms of h defined in (|l8|), 
the quadratic form in the exponent of the Gaussian prior 
( p^ has the particularly simple form 

s^C-^s = htL^(LL^)^Lh = htL^(L^)^L^Lh = h^h, 

i.e. it is equal to the inner product of h with itself. Thus, us- 
ing a Gaussian prior, the posterior probability can be written 
in terms of h as 



Pr(h|d) oc exp [-x^(h) - h^h] 



where (h) is given by (^9|) Therefore, in addition to using 
the linear relation (^^, the Wiener filter estimate s can also 
be found by first minimising the function 



<E'wF(h) =x'(h) + hth. 



(22) 



to obtain the estimate h of the corresponding hidden vector, 
and then using (|l^ to give s = Lh. 

We have developed an algorithm (which will be pre- 
sented in a forthcoming paper) for minimising the function 
$wF with respect to h. Indeed, this algorithm calculates 
the reconstruction h in slightly less CPU time than the ma- 
trix inversions and multiplications required to evaluate the 
linear relation (^^. The minimiser requires only the first 
derivatives of the function and these are given in Appendix 
A. 

Let us now consider the MEM solution. From (^), we 
see that maximising the posterior probability when assum- 
ing an entropic prior is equivalent to minimising the function 



$MEM(h) = X (h) - ^^(h, rtiu, m„) 



(23) 



The minimisation of this 2nc-dimensional functions may 
also performed using the minimisation algorithm mentioned 
above, and the required first derivatives in this case are also 
given in Appendix A. 

It is important to note that, since we are using the same 
minimiser to obtain both the Wiener filter (WF) and MEM 
reconstructions, and the evaluation of each function and its 
derivatives requires similar amounts of computation, the two 
methods require approximately the same CPU time. Thus, 
at least in this application, any criticism of MEM that is 
based on its greater computational complexity, as compared 
to the WF, is no longer valid. For both the WF and the 
MEM, the reconstruction of the six 400 x 400 maps of the 
input components requires about two minutes on a Sparc 
Ultra workstation. 



3.5 The small fluctuation limit 

Despite the formal differences between ( |22[ ) and (p3[), the 
WF and MEM approaches are closely related. Indeed the 
WF can be viewed as a quadratic approximation to MEM, 
and is commonly referred to as such in the literature. This 
approximation is most easily verified by considering the 
small fluctuation limit, in which the real and imaginary parts 
of h are small compared to the corresponding models. 



Following the discussion at the end of Section 3.3, we 



begin by setting the real and imaginary parts of all the com- 
ponents of the models vectors m„ and m„ equal to m. Then, 
expanding the sum in ( p^ ) as a power series in hj and us- 
ing (|r^, we find that for small hj the total cross entropy is 
approximated by 



5'c(h,m„,tn„) ~ - ^ 



^{h,f + Q{h,f ^ _hth 
4m 4m ' 



Thus, in the small fiuctuation limit, the posterior probability 
assuming an entropic prior becomes Gaussian and is given 



Pr(h|d) oc exp 



-X'(h)-a 



h^h 

4m 



(25) 



(21) 



In fact, this approximation is reasonably accurate provided 
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the magnitudes of the real and imaginary parts of each el- 
ement of h are less than about 3m. Since m is set equal to 
the expected rms level of these parameters, we would there- 
fore expect that for a Gaussian process this approximation 
should remain valid. In this case, the posterior probability 
( p5[ ) becomes identical to that for the WF solution, provided 
we set a = 4m. 

We note, however, that for highly non-Gaussian pro- 
cesses, the magnitudes of the real and imaginary parts of 
the elements of h can easily exceed 3m and in this case the 
shapes of the posterior probability for the WF and MEM 
approaches become increasingly different. 



3.6 The regularisation constant a 

A common criticism of MEM has been the arbitrary choice of 
regularisation constant a, which is often considered merely 
as a Lagrange multiplier. In early applications of MEM, a 
was chosen so that the misfit statistic equalled its expec- 
tation value, i.e. the number of data points to be fitted. This 
choice is usually referred to as historic MEM. 

In the reconstruction of Fourier modes presented here, 
the situation is eased somewhat since the choice a = 4m is at 
least guaranteed to reproduce the results of the Wiener filter 
when applied to Gaussian processes. In fact, when applied to 
the simulations presented in Section ^, this choice of a does 
indeed bring into its expected statistical range nf±y/2nf, 
where n/ is the number of (complex) values in the data 
vector d at each Fourier mode. 

Nevertheless, it is possible to determine the appropriate 
value for a in a fully Bayesian maimer (Skilling 1989; Gull 
& Skilling 1990) by simply treating it as another parameter 
in our hypothesis space. It may be shown (see Appendix B) 
that a must be a solution of 



(26) 



where h is the hidden vector that maximises the posterior 
probability for this value of a. The Uc x ric matrix M is given 

by 



M 



VI MmEMU , 



where Hmem is the Hessian matrix of the function $mem at 
the point h and G is the metric on image-space at this point. 

It should be noted that both the reconstruction h and 
the matrix M depend on a and so (|2^) must be solved nu- 
merically using an iterative technique such as linear inter- 
polation or the Newton-Raphson method. We take a = 4m 
as our initial estimate in order to coincide with the Wiener 
filter in the small fluctuation limit. For any particular value 
of a, the corresponding reconstruction h(a) is obtained by 
minimising ^mem as given in (^^, and the Hessian of 
the posterior probability at this point is then calculated 
(see Appendix A). This in turn allows the evaluation of 
5'c(h,mu,m„) and Tr(M~^) respectively. Typically, fewer 
than ten iterations are needed in order to converge on a 
solution a that satisfles {2m. 



3.7 Updating the ICF and models 

In the MEM approach, after the Bayesian value a for the 
regularisation constant has been found, the corresponding 



posterior probability distribution is maximised to obtain the 
reconstruction h(a), from which the estimate of the signal 
vector s may be straightforwardly derived. Once this has 
been performed for each Fourier mode, the reconstruction 
of the sky emission due to each physical component is then 
found by performing an inverse Fourier transform. 

We could, of course, end our analysis at this point and 
use the maps obtained as our flnal reconstructions. How- 
ever, we find that the results can be further improved by 
using the current reconstruction to update the IGF matrix 
L and the models mu and rtii, , and then repeating the entire 
MEM analysis discussed above. At each Fourier mode, the 
updated models are taken directly from the current recon- 
struction and the updated IGF matrix is obtained by cal- 
culating a new signal covariance matrix C from the current 
reconstruction and performing a Gholesky decomposition. 
These quantities are then used in the next iteration of the 
MEM and the process is repeated until it converges on a 
final reconstruction. Usually, fewer than ten such iterations 
are required in order to achieve convergence. 

We might expect that a similar method may be used in 
the WF case, by repeatedly calculating an updated signal 
covariance matrix from the current reconstruction and using 
it in the subsequent iteration of the WF analysis. It is well- 
known, however, that, since the WF tends to suppress power 
at higher Fourier modes, the solution gradually tends to zero 
as more iterations are performed. One would thus first have 
to correct the derived component power spectra in order to 
obtain an unbiased estimator of the real spectrum (since 
one knows by how much power has been suppressed, see the 
discussion in 5.1) before performing the next iteration. This 
could somewhat improve the determination of the dominant 
processes (but in that case the exact input spectra is of little 
impact) but it would not help with the spectrum determi- 
nation of the weak processes (since WF essentially sets their 
power spectra level at the input level). In fact WF should 
rather be thought of as the last 'polishing' step of a com- 
ponent separation, once a first determination of the power 
spectra has been achieved by other means (e.g. by singular 
value decomposition). Bouchet & Gispert (1998) have as- 
sessed the reachable accuracy level for the Planck Surveyor 
in that case. In the following, we shall restrict our discussion 
to the two extreme cases of exact prior knowledge of the co- 
variance matrix or the prior knowledge of the rms levels only 
(the power spectra being all assumed to be white noise). Of 
course, WF makes much more sense if the assumed prior is 
not far from the truth, since it is designed to take advantage 
of that information. 



3.8 Estimating errors on the reconstruction 

Once the final reconstruction has been obtained, it is im- 
portant to be able to characterise the errors associated with 
it. In the case of the Wiener filter, the reconstructed signal 
vector s at each Fourier mode may be obtained in a linear 
manner from the observed data vector using (^^. Thus the 
propagation of errors is straightforward and the covariance 
matrix of the reconstruction errors at each Fourier mode is 
given by (^. 



As mentioned in Section 



i.2, however, this simple prop- 



agation of errors is entirely a result of the assumption of 
a Gaussian prior, which, together with the assumption of 
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Gaussian noise, leads to a Gaussian posterior probability 
distribution. In terms of the vector of hidden variables h the 
posterior probability for the WF is given by 

Pr(hid) oc exp [-$wp(h)] = exp [-(h - h)'''HwF(h - h)] , 

where the Hessian matrix Hwf is given by Hwf = 
VhVh*$WF evaluated at the peak h of the distribution, and 
the function "I>wf is given by (p^). Thus, the covariance ma- 
trix of the errors on the reconstructed hidden vector is then 
given exactly by the inverse of this matrix, i.e 

((h-h)(h-h)t) = HwF. 

From (|l^), the error covariance matrix for the reconstructed 
signal vector is then given by 



{(s - s)(s - s)t) = (L(h - h)(h - h)tL^) = LH^W 



(27) 



Using the expression for the Hessian matrix given in (A7), 
and remembering that s = Lh and C = LL""", the expression 
( p?! ) is easily shown to be identical to the result (p^. 

For the entropic prior, the posterior probability distri- 
bution is not strictly Gaussian in shape. Nevertheless, we 
may still approximate the shape of this distribution by a 
Gaussian at its maximum and, recalling the discussion of 
subsection we might expect this approximation to be 
reasonably accurate, particularly in the reconstruction of 
Gaussian processes. Thus, near the point h, we make the 
approximation 



Pr(h|d) oc exp [-$MEM(h)] « exp 



(h h)TH^ 



i(h-h) 



where Hmem ~ Vh Vh» 'I'mem evaluated at h, and $mem is 
given by (|2^). The covariance matrix of the errors on the 
reconstructed hidden vector is then given approximately by 
the inverse of this matrix, and so 

((s-s)(s-s)t) ^LHmWl"^. 

In both the WF and MEM cases, the reconstructed 
maps of the sky emission due to each physical component 
is obtained by inverse Fourier transformation of the signal 
vectors at each Fourier mode. Since this operation is linear, 
the errors on these maps may therefore be deduced straight- 
forwardly from the above error covariance matrices. 



4 APPLICATION TO SIMULATED 
OBSERVATIONS 

We now apply the MEM and WF analyses outlined above to 
the simulated Planck Surveyor data discussed in Section ^ 
Clearly, both techniques rely to some extent on our prior 
knowledge of the input components we are trying to re- 
construct. Information concerning the spectral behaviour of 
each component is contained in the frequency response ma- 
trix F defined in (|l|), whereas the assumed covariance struc- 
ture of the components is contained in the signal covariance 
matrix C given in (pi]). Since we are in fact performing the 
reconstruction in the Fourier domain, the latter matrix con- 
tains the power spectrum of each component as its diago- 
nal entries and the cross power spectra between components 
as its off-diagonal entries. Strictly speaking, since we recon- 
struct the vector of hidden variables h, rather than the signal 
vector s, this power spectrum information actually resides 
in the IGF matrix L. 



For the reconstructions presented in this section, we as- 
sume that the spectral behaviour of the components is accu- 
rately known. This is certainly true for the GMBR emission 
and the kinetic and thermal SZ effects, but it is perhaps op- 
timistic to assume that this would be the case for the three 
Galactic components. In reality the spectral indices of the 
free-free and synchrotron emission are uncertain to within 
about 20 per cent and the dust temperature and emissiv- 
ity will also not be known in advance. We have investigated 
the effect of varying these parameters in the reconstruction 
algorithms and have found both the MEM and WF separa- 
tions to be quite robust. This is discussed further in Section 
^ (see also Gispert & Bouchet 1997). 

Our prior knowledge of the covariance structure or 
power spectra of the various emission components is cer- 
tainly poorer than our knowledge of their spectral be- 
haviour. Nevertheless, we are not entirely ignorant of the 
shapes of these power spectra and we would obviously wish 
to include any such information in our analysis. In order to 
investigate how the quality of the reconstructions depends 
on our knowledge of the power spectra, we have chosen to 
model two extreme cases. First, we assume knowledge of the 
azimuthally-averaged power spectra of all six input compo- 
nents, as shown in Fig. |^ together with the azimuthally- 
averaged cross power spectra between components; these 
contain cross-correlation information in Fourier space, so 
that the IGF matrix L is fully specified. In the second case, 
however, we take the opposite view and assume that al- 
most no power spectrum information is available. This corre- 
sponds to assuming a flat (white-noise) power spectrum for 
each component out to the highest measured Fourier mode. 
The levels of the flat power spectra are chosen so that the to- 
tal power in each component is approximately that observed 
in the input maps in Fig. 

It is likely, in practice, that the quality of prior informa- 
tion concerning the component power spectra will lie some- 
where between these two extreme cases. Therefore, by pre- 
senting the results for each case, we aim to provide some idea 
of the best- and worst-case limits on the quality of compo- 
nent separation that can be achieved for the Planck Surveyor 
mission. In addition, we hope to illustrate the different be- 
haviour of the MEM and WF techniques in the two extreme 
regimes. The reconstructions with full ICF information are 
intended to display that the main advantage of the MEM 
technique in this case is its superiority in reconstructing 
weak non-Gaussian processes. In the absence of IGF infor- 
mation, we wish to illustrate that the iterative formulation of 
MEM presented above allows the component separation to 
be nearly as efficient without prior knowledge as it is when 
the ICF matrix is fully specified. On the other hand, we 
anticipate much larger differences between the two regimes 
for the WF reconstructions, since this method amounts to 
designing optimal linear filters by using prior knowledge of 
the component power spectra. Nevertheless, although WF 
makes more sense as a method when the assumed prior in- 
formation is close to truth, it is of interest to investigate the 
robustness the reconstructions of the various components in 
the absence of such information. Indeed, if the CMB is well 
reconstructed with essentially no prior information given to 
the WF, then its estimate is truly very robust. 
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4.1 Reconstructions with full ICF information 

We first consider the case in which power spectrum and 
cross-correlation information are assumed, so that the ICF 
matrix L is fully specified. In this case, the Bayesian value 
of the regularising parameter a that satisfies (Eg) is found 
to be a = 0.8. 



4.1.1 The reconstructed maps 

The corresponding MEM and WF reconstructions of the six 
input components shown are shown in Figs ^ and ^ respec- 
tively, convolved with a 4.5 arcmin FWHM Gaussian beam. 
The grey scales in these figures are chosen to coincide with 
those in Fig. ^ in order to enable a more straightforward 
comparison with the input maps. 

We see that the main input components are faithfully 
reconstructed. Perhaps most importantly, the CMBR has 
been reproduced extremely accurately, and at least by eye 
both the MEM and WF reconstructions are virtually indis- 
tinguishable from the true input map. As we might expect 
the dust emission is also accurately recovered, since it dom- 
inates the high frequency channels. The free-free emission, 
which is highly correlated with the dust, has also been recon- 
structed well, with both the MEM and WF reconstructions 
containing most of the main features present in the true in- 
put map. The recovery of the synchrotron emission is also 
reeisonable, although the MEM algorithm is more successful 
in recovering the brightest regions. 

The MEM and WF reconstructions of the kinetic and 
thermal SZ effects are worth some comment. Both tech- 
niques have produced reasonable reconstructions of the ther- 
mal SZ effect for those clusters in which the effect is very 
strong. However, it is clear that the MEM has successfully 
reconstructed the SZ effect in a greater number of clusters. 
Moreover, the magnitudes of the SZ effects in the MEM 
reconstruction are closer to the true values than those ob- 
tained with the WF. Thus, as anticipated, the assumption of 
Gaussian random fields that is central to the WF approach 
leads to poorer reconstructions of highly non-Gaussian fields 
as compared with MEM. A more detailed discuss ion o f the 
recovery of thermal SZ profiles is given in Section 4.1.3. For 
the kinetic SZ effect, however, neither method has recon- 
structed any features in the input map with their true mag- 
nitudes. In fact, both methods have reconstructed fields with 
very low-level fluctuations that coincide with the bright- 
est features in the thermal SZ map. The inability of either 
method to make very accurate reconstructions of the kinetic 
SZ effect is not surprising since, as mentioned in Section ^ 
this emission due to this component is at least two orders 
of magnitude below the dominant emission component or 
the noise at all of the Planck Surveyor observing frequen- 
cies. Moreover, this component has the same frequency de- 
pendence as the primordial CMBR fluctuations, and so we 
cannot distinguish them by their spectral behaviour. Never- 
theless, it is still possible to distinguish between the CMBR 
and kinetic SZ emission on the basis of their different power 
spectra, and we do indeed obtain marginal detections of the 
kinetic SZ eff ect in some clusters; this is also discussed in 
4.lj. 



Table 2. The rms residuals per 4.5 arcmin FHWM Gaussian 
beam (in /iK) for the MEM and WF reconstructions shown in 
Figs Is] and ^, which assume full ICF information. 



Component 


MEM 
rms 


WF 

"-Tins 


CMBR 


5.90 


6.00 


Kinetic SZ 


0.85 


0.85 


Thermal SZ 


3.90 


4.10 


Dust 


1.60 


1.90 


Free- Free 


0.30 


0.37 


Synchrotron 


0.05 


0.06 



forming, a more quantitative analysis of the reconstruction 
errors is required if we are to make any meaningful com- 
parison between the MEM and WF approaches. The most 
straightforward means of comparison is to calculate the rms 
of the residuals for each set of reconstructions. For any par- 
ticular physical component, this is given by 



1 



-| 1/2 



Section 

While a visual inspection of the reconstructed maps is a 
useful method of assessing how well the algorithms are per- 



where Trcc{xi) and rtruc(a;i) are respectively the recon- 
structed and true temperatures in the ith pixel; N is the 
total number of pixels in the map. The values of erms for 
each physical component in the MEM and WF reconstruc- 
tions are shown in Table [l.l.l| . Since, for comparison pur- 
poses, both the input and reconstructed maps have been 
convolved with a 4.5 arcmin FHWM Gaussian, the erms val- 
ues quoted should interpreted as the rms residual per beam 
of this size. We see from the table that, in terms of the rms of 
the reconstruction errors, the two methods are nearly equiv- 
alent. In particular, we note that the CMBR has been recon- 
structed to an accuracy of about 6fj,K, which is the desired 
value quoted for the Planck Surveyor mission (Bersanelli et 
al. 1996). We note, however, that the rms error for MEM 
reconstruction is slightly smaller than for the WF. The rms 
errors for the other components are also similar for the MEM 
and WF reconstructions, but are always lower for the MEM 
algorithm. This is particularly true for the reconstructions 
of the thermal SZ effect, dust and free-free emission and is 
due in part to the non-Gaussian nature of these components. 

Simply quoting the rms of the residuals is, however, a 
rather crude method of quantifying the accuracy of the re- 
constructions. A more useful approach is to characterise the 
reconstruction errors on a given component by plotting the 
amplitudes of the temperature fluctuations for each pixel 
of the reconstructed map against those in the true map. 
Usually such plots consist of a collection of points, one for 
each pixel in the true/reconstructed map. We shall, however, 
adopt a slightly different approach For each component, the 
temperature range of the true map is divided into 100 bins. 
Three contours are then plotted which correspond to the 
68, 95 and 99 per cent points of the distribution of cor- 
responding reconstructed temperatures in each bin. If the 
reconstruction is particularly good, then only the 95 and 99 
per cent contours are plotted. Clearly, a perfect reconstruc- 
tion would be represented by a single diagonal box of width 
equal to the bin size used. Figs |^ and ^ show the compari- 
son plots for the WF and MEM reconstructions respectively 
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Figure 5. MEM reconstruction of the 10 X 10-degree maps of the six input components shown in Fig. ^ using full ICF information (see 
text). The components are: (a) primary CMBR fluctuations; (b) kinetic SZ effect; (c) thermal SZ effect; (d) Galactic dust; (e) Galactic 
free-free; (f) Galactic synchrotron emission. Each component is plotted at 300 GHz and has been convolved with a Gaussian beam of 
FWHM equal to 4.5 arcmin. The map units are equivalent thermodynamic temperature in fiK. 
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Figure 6. Wiener filter reconstruction of the 10 X 10-dcgrec maps of the six input components shown in Fig. |5| using full ICF information 
(see text). 
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Figure 7. Comparison of tlie input maps witli tlie maps reconstructed using the MEM algorithm with full ICF information. The 
horizontal axes show the input map amplitude within a pixel and the vertical axes show the reconstructed amplitude. The contours 
contain 50 and 99 per cent of the pixels respectively. 
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and Table [4.1.l| gives the gradient of the best- fit straight line 
through the origin for each component. 

Panel (a) in each figure shows the confidence limits for 
the reconstruction of the CMBR, and it is clear that both 
reconstructions are very accurate. In each case, for those 
points in the true CMBR map with temperatures lying in 
the range —200 /iK to 200 /iK, the 68 per cent limits of 
the reconstructed temperatures lie approximately on 
either side of the true value. This agrees with the v alues 
for Crms for each reconstruction, given in Table |4.1.l[ For 
points in the true map having very large positive or nega- 
tive values, both the MEM and WF reconstructions become 
slightly less accurate, but the errors are still in the range 
5-10 /iK. Comparing the performance of the MEM and WF 
approaches, there is some evidence in that the MEM recon- 
struction is slightly more accurate for points having large 
positive temperatures, and it is these points that make the 
largest co ntrib ution to difference in the values of firms given 
in Table. 

Panels (b) and (c) in Figs ^ and ^ show the confidence 
limits for the reconstruction of the kinetic and thermal SZ 
respectively. As we would anticipate from the maps of the 
kinetic SZ reconstructions in Figs ^(b) and |^(b), for both 
the MEM and WF techniques, the distribution of the re- 
constructed temperatures centres around zero for all values 
of the temperature in the input map. For the thermal SZ, 
however, we see that the reconstructions are considerable 
better. Nevertheless, for both reconstructions, the best-fit 
straight line through the origin has a slope that is signifi- 
cantly smaller than unity, indicating that magnitudes of the 
thermal SZ effects are generally underestimated. It is clear 
from the plots that this effect is more pronounced in the 
WF reconstruction, since the corresponding best-fit line has 
a markedly lower slope than for the MEM reconstruction. 
About the corresponding best-fit line the range in the val- 
ues of the reconstructed temperatures is slightly smaller for 
the WF reconstruction than for MEM. However, the bias in 
the best-fit line and the relatively low dispersion in the WF 
case are both due to its signal-to-noise weighting to reach 
minimum variance estimates. This tends to reduce the re- 
constructed values for weak processes like the thermal SZ. 
Accounting for the bias (see Section ^ for further discussion) 
would boost the recovered mode values and decrease the bias 
while keeping constant the signal-to-noise, i.e. it would result 
in more noise in the recovered maps without changing the 
significance level of the detections. We have not done per- 
formed this correction and have instead kept the standard 
minimum-variance WF procedure. In this case, the standard 
deviation of the reconstructed temperatures about the true 
temperature is lower for MEM, as indicated by t he rela tives 
values of firms for this component given in Table 4.1.1. The 
tendency for both methods to underestimate the magnitude 
of the thermal SZ effects is due to the fact that the emission 
in this component is dominated by dust emission and pixel 
noise at the observing frequencies with the highest angular 
resolutions. Thus information concerning the higher Fourier 
modes of the thermal SZ map is not present in the data 
and so very sharp features are unavoidably smoothed in the 
reconstructions. 

The confidence limits for the reconstructions of the 
Galactic components are shown in panels (d), (e) and (f) 
of Figs and ^; these correspond to dust, free- free and 



Table 3. The gradients of the best-fit straight line through the 
origin for the comparison plots shown in Figs ^ and ^ for the 
MEM and WF reconstructions, which assume full ICF informa- 



tion. 



Component 


MEM gradient 


WF gradient 


CMBR 


1.00 


1.00 


Kinetic SZ 


0.06 


0.05 


Thermal SZ 


0.55 


0.27 


Dust 


1.00 


1.00 


Free- Free 


0.48 


0.37 


Synchrotron 


0.62 


0.44 



synchrotron emission respectively. The confidence contours 
for the MEM and WF reconstructions of the dust compo- 
nent are indistinguishable and clearly show that the dust is 
the most accurately reconstructed component. The 99 per 
cent limits of the reconstructed temperature distributions 
are approximately constant for all values of the true input 
temperature and correspond to 3(t error in the reconstruc- 
tion of about From panels (e) and (f) it is clear that 
the reconstructions of the free- free and synchrotron emission 
are considerable less accurate. By comparing these plots for 
the MEM and WF reconstructions, we again notice (for the 
same reasons as noted above) that the best-fit straight line 
through the origin has a slope that is closer to unity for 
MEM than for the WF and the standard deviation of the 
reconstructed temperatures about these lines is also smaller 
for MEM, as indicated by the smaller corresponding values 
of firms in each case . The relative large spread of recon- 
structed temperatures for the free- free and synchrotron com- 
ponents is due partially to the fact that the reconstructions 
have low effective resolution, since the Planck Surveyor has 
relatively large beam sizes at the lower observing frequencies 
where the free- free and synchrotron emission is highest. If 
the input maps are instead convolved to a lower resolution, 
such as 20 arcmin, which is more typical of the beam sizes 
at the lower observing frequencies, then the spread in the re- 
construction values is considerably reduced. In fact for WF 
alone, one should rather convolve the input map with the 
effective beam of the reconstruction as determined from the 
WF method itself (see Bouchet et al. 1997 for examples of 
such beams), but this would prevent a straight comparison 
with MEM. 



4-1.2 The reconstructed power spectra 

Since both the MEM and WF reconstructions are performed 
in the Fourier domain, it is particularly straightforward to 
compute the reconstructed power spectra of the physical 
components. Both techniques reconstruct the signal vector 
s(fe) at all measured Fourier modes. These Fourier modes 
lie on a square 400 x 400 grid with a grid spacing Afc = 36 
wavenumbers. At a given value of k, the estimator Cp{k) of 
the azimuthally averaged power spectrum for the pth physi- 
cal component is obtained simply by calculating the average 
value of |sp(fe)p over those modes for which |fe| — k, i.e. 



Cp(k) 



N(k) 



'p {^i 



(28) 
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Figure 10. As for Figbl but for the WF reconstruction with full ICF information. 
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where N{k) is the number of measured Fourier modes satis- 
fying \ki\ — k. We note that for fc 36 it is a reasonable ap- 
proximation to identify the flat two-dimensional wavenum- 
ber k with the spherical harmonic multipole index £. The 
errors on the reconstructed power spectrum are also easily 
estimated from the errors on the reconstructed signal vectors 
at each Fourier mode. It is straightforward to show that 



Var[Cp(fe)] 



ifeii=fe 

2 

Jp(k) 



dCpjk) dCpjk) 
dsp{ki) dsp{ki) 



Var[sp(fei)] 



Sp(fei)s*(fei)Var[sp(fei)]. 



For a WF reconstruction, it is well known that C'p{k) 
is a biased estimator of the underlying power spectrum (see 
e.g. Bouchet et al. 1997). Nevertheless, this is not necessar- 
ily the case for the MEM reconstruction and, for comparison 
purposes, it is instructive to use the same power spectrum 
estimator for both the MEM and WF reconstructions. More- 
over, in this section we are interested simply in the power 
spectra of the reconstructed maps, rather than in developing 
optimal methods to recover the input power spectrum from a 
given reconstruction. In Section ^, we discuss in more detail 
the biased nature of this simple power spectrum estimate, 
and consider several variants of the standard Wiener filter 
that may be used to circumvent this problem. At this point, 
however, it is sufficient to note that where the underlying 
power spectrum of the pth process is poorly determined by 
the observations, the estimator Cp{k) can be shown to un- 
derestimate the true power spectrum. 

Figs. I and show the power spectra of the MEM and 
WF reconstructions respectively, together with the 68 per 
cent error bars. In each panel the faint line is the power 
spectrum of the reconstructed map and the bold line is the 
power spectrum of the relevant input map as shown in Fig. ^ 
(for the power spectrum comparison the maps are not con- 
volved by a 4.5 arcmin FHWM Gaussian beam). We see 
that the 68 per cent confidence intervals always contain the 
true power spectrum, which indicates that our estimate of 
the errors on the reconstructed power spectrum are quite 
robust. 

The power spectrum of the reconstructed CMBR maps 
are shown in panel (a) of each figure, and we see that both 
techniques have faithfully reproduced the true power spec- 
trum for I ^ 1500, at which point the WF reconstructed 
map begins visibly to underestimate the true spectrum. 
The MEM reconstruction, however, remains indistinguish- 
able from the true power spectrum up to ^ ~ 2000, where it 
too begins to underestimate the true spectrum. 

The power spectra of MEM and WF reconstructions of 
the kinetic SZ map are shown in panel (b) of each figure and 
are predictably poor, with both reconstructed power spectra 
underestimating the true one over almost the entire range of 
measured multipoles. For the thermal SZ component shown 
in panel (c), both methods produce maps with power spec- 
tra that lie close to the true spectrum for at lower multi- 
poles. However, we again find that the MEM reconstruction 
remains faithful out to larger multipoles {£ ~ 1000) as com- 
pared to the WF reconstruction (£ « 300) . 

Panels (d) , (e) and (f ) in Fig. ^ and ^ show the power 
spectra of the MEM and WF reconstructions of the Galac- 



tic dust, free-free and synchrotron. As expected, for the 
dust component both methods produce reconstructions with 
power spectra that are very close to the power spectrum of 
the true map over a large range of multipoles. The power 
spectrum of the MEM reconstruction is indistinguishable 
from that of the true map up to ^ ~ 3000, whereas the 
WF reconstruction becomes inaccurate at ^ ~ 2000. For the 
free-free component, both MEM and WF produce recon- 
structions with power spectra that slightly underestimate 
the true spectrum over the entire range of measured mul- 
tipoles. Finally, the power spectra of the synchrotron re- 
constructions show the MEM technique reproduces the true 
power spectrum to moderate accuracy for £ ^ 400, whereas 
the WF reconstruction underestimates the true power spec- 
trum for all multipoles. 

4-1.3 The reconstructed kinetic and thermal SZ effects 



As discussed in Section 3.2, a central assumption of the 



Wiener filter method is that the fields to be reconstructed 
are well described by Gaussian statistics. This is clearly not 
a valid assumption for either the kinetic or thermal SZ ef- 
fects for which the emission consists of sharp peaks. Thus 
we would expect that it is in the reconstruction of this com- 
ponent especially that the difference between the MEM and 
WF approaches would be most apparent. 

Unfortunately, the small magnitude of the kinetic SZ, 
together with a frequency spectrum identical to that of 
the primary CMBR fluctuations, means that neither of 
the methods is capable of reconstructing this component 
very accurately. Nevertheless, the both methods do make 
marginal detections of the kinetic SZ effect in some clus- 
ters. Fig. shows the MEM reconstruction of the kinetic 
SZ map compared to the true map convolved to the lowest 
Planck Surveyor resolution of 33 arcmin; it is in this low- 
est frequency channel that the relative contribution of the 
kinetic SZ effect to the total emission is highest. From the 
figure, we see that the MEM algorithm has recovered the 
kinetic SZ effect at this lowest resolution, but only in a few 
clusters. By comparing these maps with the MEM thermal 
SZ reconstruction in Figs ^(c), we see that these clusters 
are those with the largest thermal SZ effects. Conversely, 
the largest kinetic SZ effect in the true map is not recov- 
ered with any accuracy, since by chance it corresponds to a 
cluster with a small thermal SZ effect. 

For the thermal SZ, we see from Figs ^(c) and ^(c) that 
both the MEM and WF algorithms reproduce the main fea- 
tures present in the input map, but that MEM reconstructs 
the thermal SZ effect in many more clusters than the WF 
and that the magnitude of the reconstructed effects using 
MEM are closer to those in the input map. This observation 
is confirmed by investigating the errors on the reconstructed 
maps and by comparing the power spectra of the input map 
and the reconstructions. 

It is hoped that Planck Surveyor observations of the 
thermal SZ effect, together with follow-up X-ray observa- 
tions of the relevant galaxy clusters, will provide a large 
catalogue of Ho determinations to supplement the value of 
Ho obtained from the accurate measurement of the primor- 
dial CMBR power spectrum. In order for this to be possible, 
however, the density profile of the clusters must be known. 
Furthermore, an accurate determination of the density pro- 
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Figure 11. (a) The input kinetic SZ map convolved to the lowest Planck Surveyor angular resolution of 33 arcmin. (b) The MEM 
reconstruction of the kinetic SZ effect. The map units are equivalent thermodynamic temperature in /iK at 300 GHz. 



file of a cluster enables the construction of optimal filters, 
tuned to the individual cluster characteristics, that may en- 
able the magnitude of the kinetic SZ to be recovered more 
accurately and hence allow its peculiar radial velocity to be 
measured to greater precision (Haehnelt & Tegmark 1996). 
Clearly, a large catalogue of radial cluster velocities mea- 
sured across the whole sky would be an invaluable resource 
for the investigation of large-scale motions in the Universe. 

Fig. ^ shows the MEM and WF reconstructions of the 
thermal SZ profiles for a few typical clusters. These pro- 
files are plotted as dashed lines and dotted lines respectively 
and are produced by making cuts through the reconstructed 
maps shown in Figs ^c) and|^(c). The reconstructions are 
compared with the true cluster profiles convolved with a 
Gaussian beam of FWHM 10', which are plotted as solid 
lines. Such a convolution is necessary in order to make a 
meaningful comparison since, as we see from Fig. ^, the 
thermal SZ effect is severely dominated by dust emission 
and pixel noise in the frequency channels above 100 GHz, 
which have the highest angular resolutions. Thus the Planck 
Surveyor observations contain very little information on the 
thermal SZ effect at angular resolutions above about 10 ar- 
cmin. 

From Fig. ^ we see that the MEM reconstruction of 
both the peak magnitude of the SZ effect and the cluster 
profile are closer to the true maps than those produced by 
the WF. We note that, as expected, the WF underestimates 
the magnitude of the effect and reconstructs profiles that 
are far less peaked. By allowing for the bias inherent in the 
WF method, it is possible to increase the heights of the 
main peaks in the reconstruction, but only at the cost of 
increasing the overall rms residuals significantly, since the 
signal-to-noise ratio for a given WF reconstruction is fixed. 

At first sight, it appears that the MEM reconstructions 
contain several spurious features as compared to the input 
profiles. This appears to have occurred most dramatically 
in the top panel of the figure, on the right-hand side of the 
central cluster profile. In fact, this phenomenon illustrates 
the care that must taken in interpreting plots of this type. 



since this feature is in fact present in the true map, but 
has been smoothed out by convolving the image to 10 ar- 
cmin resolution. The reason it is present in the MEM recon- 
struction is that the effective resolution of the MEM (and 
WF) reconstructions can vary across the map, depending 
on the level of the recovered process compared to the other 
processes and the pixel noise. Thus, in some regions, some 
super-resolution is possible which leads to the reconstruction 
of features that are considerable smoothed by the convolu- 
tion with the 10 arcmin beam. In different regions, however, 
where the other physical components happen to have high 
levels of emission, or the level of pixel noise is greater, than 
this super-resolution does not occur. 



4.2 Reconstructions with no ICF information 



Throughout subsection 4.1, the reconstructions were made 
assuming full ICF information, which consists of a knowl- 
edge of the azimuthally-averaged power spectrum of each 
input map, together with cross-correlation information. In 
this subsection, we consider the opposite extreme and obtain 
MEM and WF reconstructions assuming virtually no ICF 
information. In this case we assume no cross-correlations 
between components (so that the ICF matrix L is diagonal) 
and initially we assume the power spectrum of each com- 
ponent to be constant for all measured Fourier modes and 
normalised to give approximately the observed rms fiuctua- 
tion in the corresponding map. 

In this case, it is no longer possible in principle to dis- 
tinguish between the primordial CMBR fiuctuations and the 
kinetic SZ effect, since they have the same frequency char- 
acteristics, and initially the same power spectrum (to within 
a normalisation constant). Nevertheless, we find that by at- 
tempting to reconstruct the kinetic SZ effect in this case, the 
reconstructions of the other components are not noticeably 
affected. Thus, in this section, we still attempt to reconstruct 
all six components. For the MEM solution the reconstruc- 



tion process is iterated, as discussed in Section 3.7, but this 
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Table 4. The rms residuals per 4.5 arcmin FWHM Gaussian 
bearn fin uKI for the MEM and WF reconstructions shown in 
Figs hj and [141 which assume no ICF information. 



Component ef^^ e^S ^MEM 5^^.) 



CMBR 6.10 7.50 7.10 

Kinetic SZ 0.85 0.85 0.85 

Thermal SZ 4.35 4.61 4.42 

Dust 1.90 2.10 2.07 

Free- Free 0.44 0.50 0.48 

Synchrotron 0.07 0.08 0.07 





100 200 
Size (arcmin) 



300 



Figure 12. The cluster profiles of some SZ effect reconstructions 
compared to the input profiles convolved with a 10' beam (solid 
line). The full MEM with full ICF information was used to re- 
construct the dashed line whereas the quadratic approximation 
to this was used to reconstruct the dotted line. 



is not possible for the WF technique since the solution in 
this case tends to zero. Hence, for the WF only the origi- 
nal solution is presented. For the MEM reconstruction, the 
Bayesian value of the regularising parameter a that satisfies 
@ is found to be a = 2.9. 



4-2.1 The reconstructed maps 

Figs |l3| and |l4| show respectively the MEM and WF re- 
constructions of the six input components. Once again, for 
comparison purposes, the grey scales in these figures are cho- 
sen to coincide with those in Fig. |l| and both sets of maps 
have beem convolved with a 4.5 arcmin FWHM Gaussian 
beam. Comparing these figures with the input maps, we see 
that by assuming no ICF information, the overall quality 
of the reconstructed maps has been somewhat reduced, in 
particular for the WF. 

It is encouraging to note that both the MEM and WF 
reconstructions of the CMBR, shown in panel (a) of each 



figure, still closely resemble the true input map. This is also 
true for the reconstructions of the dust emission shown in 
panel (d) of each figure. As mentioned in section ^ it is 
possible, by simple visual inspection of the data maps at 
each observing frequency, to distinguish the CMBR and dust 
contributions quite clearly, and so we would indeed hope 
that any reasonable separation algorithm would be able to 
reconstruct these components with some accuracy. 

The quality of both the MEM and WF reconstructions 
of the free-free and synchrotron components has been sig- 
nificantly reduced by assuming no ICF information. We do 
see, however, that both reconstructions of the free-free com- 
ponent contain the main features of the input map, but 
smoothed to a much lower resolution, but that MEM recon- 
struction contains slightly more detail. For the synchrotron 
component, shown in panel (f), the WF algorithm has again 
produced a very low-level, smoothed reconstruction of the 
input map, whereas the MEM reconstruction has been more 
successful in recovering the brightest regions. 

As expected, the quality of the MEM and WF recon- 
structions differs most for the thermal SZ effect, shown in 
panel (c) of each figure. Although the MEM reconstruction 
is not as accurate as that obtained assuming full ICF infor- 
mation, it still provides a reasonable representation of the 
main features of the input map. This is certainly not true 
for the WF reconstruction which contains only very low-level 
features at the positions of the few largest peaks. 

The rms of t he res iduals for each set of reconstructions 
are given in Table 4.2.1 , We see from the table that the MEM 



reconstruction of the CMBR has a significantly lower rms er- 
ror than the WF reconstruction and is only marginally less 
accurate than that obtained assuming full ICF information. 
Indeed, once again, the rms error of the MEM reconstruc- 
tions of the other components are again consistently lower 
than the corresponding WF reconstructions. 

As mentioned above, the MEM technique is iterated 
until the reconstructions coverged, but this is not directly 
possible for the WF. In is therefore of some interest to in- 
vestigate how much the MEM solut ion is improved by this 
iterative process. Therefore, in Table i.2.1, we also quote the 



rms residuals for the MEM reconstruction after just one iter- 
ation. As we might expect, the initial rms errors are slightly 
better than those found using the WF, but we also see that 
iterating the MEM technique clearly reduces the residuals, 
most notably for the CMBR reconstruction. 

Figs ^if and |l^ show the distribution of pixel temper- 
atures in the MEM and WF reconstructions as compared 
to the pixel temperatures in the corresponding input maps. 
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Figure 13. MEM reconstruction of tlie 10 X 10-degree maps of tlie input components shown in Fig. |lj, using no power spectrum 
information (see text). The components are: (a) primary CMBR fluctuations; (b) kinetic SZ effect; (c) thermal SZ effect; (d) Galactic 
dust; (e) Galactic free-free; (f) Galactic synchrotron emission. Each component is plotted at 300 GHz and has been convolved with a 
Gaussian beam of FWHM equal to 4.5 arcmin. The map units are equivalent thermodynamic temperature in fiK. 
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Figure 15. Comparison of ttie input maps witti the maps reconstructed using the MEM algorithm with no ICF information. The 
horizontal axes show the input map ampHtudes and the vertical axes show the corresponding reconstructed amplitudes. The contours 
contain 68, 95 and 99 per cent of the pixels respectively. 
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Table 5. The gradients of the best- fit straight hne through the 
origin for the comparison plots shown in Figs y and for the 
MEM and WF reconstructions, which assume no ICF informa- 
tion. 



Component 


MEM gradient 


WF gradient 


CMBR 


1.00 


0.97 


Kinetic SZ 


0.00 


0.00 


Thermal SZ 


0.50 


0.24 


Dust 


1.00 


1.00 


Free- Free 


0.60 


0.22 


Synchrotron 


0.47 


0.13 



Table 1.2. 1 gives the gradient of the best- fit straight line 
through the origin for each component. 

The confidence limits for pixel temperatures in the re- 
constructed CMBR maps are shown in panel (a) in each fig- 
ure. We see that for most input temperatures the confidence 
contours are somewhat narrower for the MEM reconstruc- 
tion than for the WF, and this is refiected in its lower erms 
value. At high input temperatures, however, the 95 and 99 
per cent limits becom e sligh tly wider for the MEM recon- 



struction. From Table 4.2.1 



we also notice that the best-fit 
straight line through the origin has a slope of approximately 
0.96 for the WF as compared to a value of 1.0 for the MEM 
reconstruction. Thus in the absence of ICF information the 
WF reconstruction slightly underestimates the true temper- 
ature in each pixel of the CMBR map. 

Panel (c) in Figs |l^ and |l^ shows the confidence limits 
for the reconstructions of the thermal SZ. We see for the 
MEM algorithm that the confidence contours are slightly 
wider than those in Fig. ^c), obtained assuming full ICF in- 
formation. The best-fit straight line through the origin again 
has a slope significantly smaller than unity, indicating that 
magnitudes of the thermal SZ efi'ects are underestimated, 
but its slope is close to that obtained with full ICF informa- 
tion. For the WF reconstruction, however, the best-fit line 
now has a slope very close to zero. 

The confidence contours for the WF and MEM recon- 
structions of the dust component, shown in panel (d) of each 
figure, are again indistinguishable and clearly show that the 
dust is once more the most accurately reconstructed com- 
ponent. Finally, from panels (e) and (f) in Fig. ^ we see 
that the slope of the confidence contours is much closer to 
uni ty for the MEM reconstructions than for the WF (see Ta- 
ble 1.2.1). We note, however, that about the best- fit line the 
spread of values in the WF reconstructions is smaller than 
for MEM. Nevertheless, the spread of reconstructed temper- 
atures about the true values is still smaller for MEM, as seen 



by the relative values of emis given in Table 4.2.1 



4-2.2 The reconstructed power spectra 

For the reconstruction presented in this section the assumed 
power spectra of the input components were constant for all 
measured Fourier modes. It is therefore of particular interest 
to investigate the power spectra of the reconstructed maps 
in this case. 

The reconstructed power spectra are calculated in the 
same manner as that outlined in subsection 4.1, as are the er- 



rors bars. The resulting power spectra are plotted in Figs |l^ 
and |l^ for the MEM and WF reconstructions respectively. 

The power spectrum of the reconstructed CMBR maps 
are shown in panel (a) of each figure, and we see that 
the MEM and WF techniques produce noticeably different 
results. For the MEM reconstruction the power spectrum 
closely follows the true spectrum out to l~ 1500, at which 
point it drops rapidly to zero. For the WF reconstruction, 
however, the features in the power spectrum match those 
in the true spectrum for I ^ 1000, and then slightly un- 
derestimate the true power for £ ~ 1000-1500. At higher 
multipoles, the power spectrum of the WF reconstruction 
contains a spurious hump, which results in an overestimate 
of the true power spectrum for £ ~ 2000-5000, before the 
power spectrum finally tends to zero. 

For MEM and WF reconstructions of the thermal SZ 
map, the corresponding power spectra are shown in panel 
(c) of each figure. We see that the power spectrum of the 
MEM reconstruction is reasonably accurate out to ^ ~ 1000, 
but does overestimate the power slightly over this range. 
At higher multipoles, we again find that the MEM power 
spectrum drops rapidly to zero. The power spectrum of the 
WF reconstruction underestimates to true power at high 
multipoles and is only reasonably accurate for £ ^ 200. 

Panels (d), (e) and (f) in Fig. 0and|l| show the power 
spectra of the MEM and WF reconstructions of the Galactic 
dust, free- free and synchrotron. For the dust component, we 
see that the power spectrum of the MEM component follows 
the true power spectrum up to ^ « 2000, before dropping 
rapidly to zero. For the WF reconstruction, however, the re- 
covered power spectrum is accurate up to ^ ~ 3000, but then 
exhibits a spurious hump which results in the overestimation 
of the true power at all higher multipoles. The power spectra 
of the reconstructed free-free maps are shown in panel (e) 
of each figure. We see that the MEM reconstruction is accu- 
rate for £ ^ 100, but then underestimates the true power at 
higher multipoles, whereas the WF reconstructions underes- 
timates the true power at all multipoles. For the synchrotron 
component, the WF reconstructions underestimate the true 
power at all multipoles, whereas the MEM solution oscillates 
widely about the true power spectrum for £ ^ 300, before 
dropping to zero. 

4-2.3 The reconstructed thermal SZ profiles 

From Figs |l^ and ^ we see that assuming no ICF informa- 
tion leads to a substantial difference in the quality of the 
MEM and WF reconstructions of the thermal SZ effect. We 
find that the MEM reconstruction is only marginally less 
accurate than that obtained assuming full ICF information, 
but the WF reconstruction is considerably poorer in this 
case. 

Fig. ^ shows cuts through the MEM and WF recon- 
structions that coincide with several typical clusters. The 
reconstructed MEM and WF cluster profiles are plotted as 
dashed lines and dotted lines respectively and are again com- 
pared with the true cluster profiles convolved with a Gaus- 
sian beam of FWHM 10' (solid line). From the figure we 
see that there is indeed a considerable difference between 
the MEM and WF reconstructions. The cluster profiles in 
the MEM reconstruction are reasonable approximations to 
the input profiles, although the reconstructed peak values 
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Figure 17. The power spectra of the input maps (bold line) compared to to the power spectra of the maps reconstructed using MEM 
with no ICF information (faint line). The dotted lines show one sigma confidence limits on the reconstructed power spectra. 
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Figure 18. As for Fig |l7|, but for the WF reconstruction with no ICF information. 
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Figure 19. Tlie cluster profiles of some SZ effect reconstructions 
compared to the input profiles convolved with a 10' beam (solid 
line). The full MEM with full ICF information was used to re- 
construct the dashed line whereas the quadratic approximation 
to this was used to reconstruct the dotted line. 



are slightly lower in most cases. For the WF reconstruc- 
tion, however, the cluster profiles are very poorly approxi- 
mated indeed, with the peak value often underestimated by 
an order of magnitude. Of course, this simply reflects that 
it would be extremely ill-advised to use WF for determin- 
ing weak processes in the absence of the power spectrum 
information of which WF is meant to take advantage. 



clusters and dust, free-free and synchrotron emission from 
the Galaxy. 

We flnd that by assuming a suitable Gaussian prior in 
Bayes' theorem for the sky emission, we recover the stan- 
dard Wiener filter (WF) approach. Alternatively, we may 
assume an entropic prior, based on information-theoretic 
considerations alone, from which we derive a maximum en- 
tropy method (MEM). We apply these two methods to the 
problem of separating the different physical components of 
sky emission. 

The reconstructions presented in Section^ show that, in 
the absence of severe point source contamination, the Planck 
Surveyor observations enable the recovery of the CMBR fluc- 
tuations with an absolute accuracy of about 6 /iK. Moreover, 
depending on assumed knowledge of the power spectra of the 
various components, we find that it is possible to reconstruct 
the emission due to other components with varying degrees 
of accuracy. In particular, the Galactic dust emission may 
be reconstructed with an accuracy of about 2 /iK. The main 
features of Galactic free-free and synchrotron are also re- 
constructed. We find that both the magnitude and radial 
profile of the thermal SZ effect may be recovered for rich 
clusters, but the reconstruction of the kinetic SZ effect is 
only possible in clusters which also have a large thermal SZ 
effect. Given the cluster gas profile derived from the ther- 
mal SZ effect, however, it may be possible to recover the 
kinetic effect more successfully by using optimal filtering 
methods tailored to individual cluster shapes (Haehnelt & 
Tegmark 1996, Aghanim et al. 1997). We also find that the 
power spectra of the input components are well-recovered. 
Irrespective of the amount of prior information assumed, we 
find that the CMBR power spectrum is faithfully reproduced 
up to ^ ~ 2000, where as the recovered dust and thermal SZ 
power spectra are accurate up to ^ « 3000 and i « 1000 
respectively. 

In nearly all cases, we flnd that the MEM algorithm 
produces equally or more accurate reconstructed maps and 
power spectra of the various components than the WF, and 
this is particularly true for reconstructions of the thermal SZ 
effect. This difference is most likely a result of the assump- 
tion in the WF method that the fields to be reconstructed 
are well-described by Gaussian random fields. This is clearly 
not the case for the SZ effect, and other foreground compo- 
nents such as Galactic dust also appear quite non-Gaussian 
in nature. In the case of Galactic dust, however, the informa- 
tion provided by the three highest Planck Surveyor observ- 
ing frequencies allows the WF also to recover this process 
very accurately. The superiority of MEM is most apparent 
for processes which are both weak and non-Gaussian. 



5 DISCUSSION AND CONCLUSIONS 

In this paper we adopt a Bayesian approach to the sepa- 
ration of foreground components from CMBR emission for 
satellite observations. In particular, we use simulated Planck 
Surveyor observations of a 10 x 10 degree patch of sky at 
ten different observing frequencies performed by Gispert & 
Bouchet (1997) and Bouchet & Gispert (1998) . The sky 
emission includes contributions from primary CMBR fluctu- 
ations, kinetic and thermal Sunyaev-Zel'dovich effects from 



5.1 Variations on standard Wiener filtering 

By assuming a Gaussian prior in Bayes' theorem, we derived 
the standard form of the Wiener fllter. This approach is op- 
timal in the sense that it is the linear fllter for which the 
variance of the reconstruction residuals is minimised. This 
is true both in the Fourier domain and the map domain. 
Nevertheless, as mentioned in section 4.1, it is straightfor- 



ward to show that this algorithm leads to maps with power 
spectra that are biased compared to the true spectra, and 
this leads us to consider variants of the standard Wiener 
filter. 
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The bias in the power spectrum of the standard WF 
map reconstruction may be quantified by introducing, for 
each physical component, a quality factor Qp{k) at each 
Fourier mode k (Bouchet et al. 1997). This factor is given 
by 



Bayesian value of a in Appendix B, we may obtain an anal- 
ogous expression to ( [26[ ) that defines a Bayesian value for 
rj. Indeed, with the inclusion of the parameter rj, the WF 
method is simply the q uad ratic approximation to the MEM, 
as discussed in Section 3.5. However, even with the inclusion 



where R(fe) is the response matrix of the observations at the 
Fourier mode k, as defined in equation (^, and Wffc) is the 
corresponding Wiener matrix given in equation (|l4|). The 
quality factor varies between unity (in the absence of noise) 
and zero. If Sp(fc) is the WF estimate of the pth component 
of the signal vector at k and Sp{k is the actual value, then 
it is straightforward to show that 

{\spik)f)^Qpik){\sp{k)f). 

Thus, in similar way, the expectation value of the naive 
power spectrum estimator defined in (^8|) is given by 
{Cp(k)) — Qp{k){Cp{k)), where Qp(k) is the average of the 
quality factors at each Fourier mode satisfying |fc| = k; thus 
the estimator in equation (|8|) is biased and should be re- 
placed by Cp(k)/Qp{k). In addition, Qp may be considered 
as the effective .^-space window of the experiment for the 
process p. 

It is clearly unsatisfactory, however, to produce recon- 
structed maps with biased power spectra and, from the 
above discussion, we might consider using the matrix with 
elements Wp^/Qy^ to perform the reconstructions. Bouchet 
et al. (1997) shows that this leads to reconstructed maps 
that do indeed possess unbiased power spectra and, more- 
over, the method is less sensitive to the assumed input 
power spectra. However, one finds in this case that the vari- 
ance of the reconstruction residuals is increase by a factor 

1/2 

2(1 — Qp )/(l — Qp) compared to those obtained with the 
standard WF and so the reconstructed maps appear some- 
what noisier. 

Another variant of the Wiener filter technique has been 
proposed by Tegmark & Efstathiou (1996), and uses the ma- 
trix Wpu/Qp to perform the reconstructions. This approach 
has the advantage that the reconstruction of the pth phys- 
ical component is independent of its assumed input power 
spectrum. Nevertheless, Bouchet et al. (1997) show that the 
variance of the reconstruction residuals for this technique is 
then increased by the factor (1/Qp — 1)/(1 — Qp) as com- 
pared to the standard WF, which results in even noisier 
reconstructed maps. 

As a final variant, Tegmark (1997) suggests the inclu- 
sion into the WF algorithm of a parameter -q that scales 
the assumed input power spectra of the components. This 
parameter can be included in all of the versions of the WF 
discussed above and is equivalent to assuming in Bayes' the- 
orem a Gaussian prior of the form 

Pr(s) oc exp (^— 77S^C^^s) . 

In the use of this variant for the analysis of real data, rj 
is varied in order to obtain some desired signal-to-noise ra- 
tio in the reconstructed maps by artificially suppressing or 
enhancing the assumed power in the physical components 
as compared to the noise. Clearly, rj plays a similar role in 
the WF analysis to the parameter a in the MEM. Thus, 
by making the appropriate changes to the calculation of the 



of the rj factor, we find that the corresponding reconstruc- 
tions of non-Gaussian components are still somewhat poorer 
than for MEM. 



5.2 Uncertainties in spectral behaviour 

In creating the reconstructions presented in the is paper, we 
have throughout assumed that the frequency dependence 
of all the components are known a priori. This is a rea- 
sonable for the CMBR emission, as well as the kinetic and 
thermal SZ effects, but it is unlikely to be the case for the 
three Galactic components. For real observations, the spec- 
tral indices of the free-free and synchrotron emission will be 
uncertain to within about 20 per cent. Moreover, the dust 
temperature and emissivity may be known to even poorer 
accuracy. 

If we assume for the moment that the frequency depen- 
dence of each component is the same across the entire 10 x 10 
degree field, then we find that reasonable uncertainties in 
the parameters describing the Galactic components do not 
significantly affect our reconstructions. In fact, we find that 
the dust temperature and emissivity may be determined to 
within 1 per cent accuracy from the data by including them 
as free parameters in either the MEM or WF algorithm. The 
resulting reconstructions of all the physical components are 
virtually indistinguishable from those obtained by assuming 
these parameters. Unfortunately, we find that it is not possi- 
ble to determine either the free-free or synchrotron spectral 
index in this way. Nevertheless, if in the algorithm we as- 
sume a spectral index for either component that is in error 
by within 20 per cent, we find that the reconstructions of the 
remaining components are virtually unaffected. The result- 
ing reconstructions of the free-free and synchrotron compo- 
nents are, however, slightly poorer in this case. 

It is clear that for real observations we may not as- 
sume that the frequency dependence of the emission in each 
component is the same across the field. In this case, the 
method must be modified slightly, as discussed by Tegmark 
& Efstathiou (1996) and Maisinger et al. (1997), by the 
introduction of additional channels in the reconstruction. 
For instance, if we assume that the frequency dependence 
of the synchrotron emission is of the form / oc , with 
P = — 0.7±0.2, we simply include two synchrotron channels, 
one with /3 = —0.5 and one with /3 — —0.9, or even with in- 
termediate values, and afterwards sum over these channels 
to obtain the reconstructed synchrotron map. Alternatively, 
one could consider deviations from the mean spectrum as 
just another template to be recovered with a modified spec- 
tral behaviour as obtained by linearising the frequency de- 
pendence of the intensity with theses deviations (Bouchet et 
al. 1996). 



5.3 Future improvements and modifications 

In this paper, the simulated Planck Surveyor observations 
were somewhat idealised in that it was assumed that the 
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beam at each observing frequency was a simple Gaussian. 
For the real observations, however, it is unavoidable that the 
beam will in fact possess sidelobes at some level, and care 
must be taken to include any such features into the analysis, 
in particular if these sidelobes contain emission from any 
strong sources. 

The simulated observations presented here also assume 
that any striping due to the scan strategy has been removed 
to a sufficient level so that it may be considered negligi- 
ble. In fact, it may be possible to use MEM to perform the 
destriping of the maps and the component separation si- 
multaneously. Indeed the simultaneous reconstruction of a 
deconvolved CMBR maps and the removal of scan baselines 
has already been performed using MEM in the analysis of 
Tenerife data (Jones et al. 1997). 

In terms of computational speed, however, the most im- 
portant assumption made in our simulations were that the 
beam at each frequency does not change shape with posi- 
tion on the sky This assumption is not unreasonable in the 
analysis of small patches of sky considered here, but may 
be questionable for all-sky maps. The importance of this 
assumption lies in the fact that the beam-smoothing may 
be written as a convolution and therefore allows us to anal- 
yse the observations entirely in the Fourier domain, where 
each mode may be considered independently. As discussed 
in Section ^, this means that the analysis is reduced to a 
large number (400 x 400 x 6) of small-scale linear inversion 
problems and so is computationally very fast. 

If the beam is spatially varying, however, the beam- 
smoothing cannot be written as a simple convolution. In 
this case the analysis should properly be performed in the 
sky plane, and requires the use of sparse matrix techniques 
to compute the beam-smoothing at each point on the sky as 
opposed to Fast Fourier Transforms (FFTs) . In addition, the 
matrices involved in the linear problem are then very large 
indeed, since we are attempting simultaneously to determine 
400 X 400 X 6 parameters by the minimisation of a function 
of corresponding dimensionality. The large dimensionality 
of the problem also complicates the inclusion of power spec- 
trum information and the determination of Bayesian values 
for a in the MEM algorithm and r) in the WF. Nevertheless, 
the authors have investigated the use of MEM and the WF 
in this case and find that reconstructions similar to those 
presented here can be performed in about 12 hours of CPU 
on a SPARC 20 workstation. For both MEM and WF, how- 
ever, the calculation of errors cannot be performed by invert- 
ing the Hessian matrix of the posterior probability, since this 
matrix has dimensions (400 x 400 x 6)^ . Although this matrix 
is in fact reasonable sparse, the inversion is still not feasible. 
Instead, the errors on the reconstructions must be estimated 
by performing several hundred Monte-Carlo simulations for 
different noise realisations (see Maisinger et al. 1997). A full 
discussion of the performance of sky-plane MEM and WF 
algorithms, when applied to simulated Planck Surveyor ob- 
servations, will be presented in a forthcoming paper. 

Finally, perhaps the most important improvement on 
the simulations and reconstructions presented here is the in- 
clusion of a realistic population of point sources. Using the 
point source simulations of Toffolatti et al (1998), a full in- 
vestigation of the effects on the reconstruction of the CMBR 
and other components is given by Hobson et al. (in prepa- 
ration). 
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APPENDIX A: CALCULATION OF 
DERIVATIVES 



As discussed in Section 3.4, maximising the posterior proba- 



bility for the WF and MEM cases is equivalent to minimising 
respectively the functions $wf and $mem, which are given 
by (H) and @ as 



'J>WF(h) = x'(li)+hth, 



i(h) 



X (h) - aScih, m„, nit,) 



(Al) 
(A2) 



From (^^, in each case the standard misfit statistic may 
be written in terms of the hidden vector h — L~^s as 



X (h) = (d RLh)TN^(d RLh) 



(A3) 



The cross entropy S'c(h, rtiu, m„) for this complex image is 
given by and (|l|). 

Since h is a complex vector we may consider ^wf and 
"3i>MEM to be functions of the real and imaginary parts of the 
elements of h. Alternatively, we may consider these func- 
tions to depend upon the complex elements of h, together 
with their complex conjugates. While it is clear that the for- 
mer approach is required in order to use standard numerical 
minimisers, a simpler mathematical derivation is provided 
by adopting the latter approach. In any case, derivatives 
with respect to the real and imaginary parts of h may be 
easily found using the relations 

Vs}(h) = Vh + Vh-, 
Va(h) = i(Vh-Vh.)- 



Differentiating (A3) with respect to h and h*, we find 
the gradient of x^ is given by 

Vh.x'= [VhxT = -'-^R^I^"'('^-R'-h), (A4) 

and upon differentiating once more we find the Hessian (cur- 
vature) matrix of x^ has the form 

VhVh-x^ = L'^R^N-^RL. (A5) 

Using (A4) and (A5), the gradient of $wf in (^Tj) is 
simply given by 



Vh«"I>WF = [Vh^WF] 



L^R^N" 



RLh) 



and its Hessian matrix reads 

HwF = VhVh.$wF = L'^R''"N-^RL + I, 



(A6) 



(A7) 



where I is the unit matrix of appropriate dimensions. We 
note that the Hessian matrix for $wf is independent of h. 

By setting the right-hand side of (A6) equal to zero, and 
remembering that s = Lh and C = LL , it is straightforward 
to obtain the linear relation (^) for the WF solution. More- 
over, from (p?!), the error covariance matrix for the recon- 
structed signal vector is given by E = LH^^L^, and using 
(A7) this is easily shown to be identical to the result (p^. 

In a similar way , we may calculate the derivatives of 
'I'mem defined in ([A2[ ). Unfortunately, the form of the cross 
entropy given in (|16D precludes us from writing its gradient 
or curvature as a simple matrix multiplication, and we must 
instead express them in component form. From ( p^ ) and 
(p^, we find the components of the gradient vector of the 
cross entropy are given by 



iln 



25R(m„,) 



iiln 



29(m„j 



(A8) 



where 5R(i/>j) = [5R(/ij) 4- 43ff(muj)5R(m„j)]^/^ and a similar 
expression exists for 3(7/jj). Differentiating once more we 
find the components of the Hessian of the cross entropy to 
be given by 



dhjdhl 



+ ■ 



if j = fe 



(A9) 



and equals zero otherwise. We note that these components 
may be used to define the (diagonal) metric on the space 
of images, which is given simply by by G(h) — — VhVh»5'c 
(Skilling 1989: Hobson & Lasenby 1998). 



Using ( AS ) and ( A9 ) the gradient and Hessian of "I'mem 
are then easily calculated. In particular, we find that the 
Hessian matrix is given by 

Hmem = VhVh. (x^ - aSc) = L^R^N-'RL + qG (AlO) 

where G is the image space metric and w e h ave used the 
expression for the curvature of x^ given in ( A5 ) . In contrast 
to (A7), we see that the Hessian matrix of "I'mem depends 



on h through the metric G. 



APPENDIX B: BAYESIAN VALUE FOR a 

A Bayesian value for a may be found simply by treating it as 
another parameter in our hypothesis space. This procedure 
is outlined for the case of real images in Skilling (1989) and 
Gull & Skilling (1990), and we modify their treatment here 
in order to accommodate complex images h. 

After including a into our hypothesis space, the full 
joint probability distribution can be expanded as 



Pr(h,d,a) = Pr(Q)Pr(hja)Pr(d|h,a) 
= Pr(Q)Pr(h|a)Pr(d|h) 



(Bl) 



where in the last factor we can drop the conditioning on a 
since it is h alone that induces the data d. We then recog- 
nise this as the likelihood. Furthermore, the second factor 



Pr(h]a) can be identified as the entropic prior and so (Bl) 
becomes 



Pr(h,d, 



Pr(a) 



-X^(h) 



Zs{a) Zl 

„aSc(h)-x^h) 



where Zs (a) and Zl are respectively the normalisation con- 
stants for the entropic prior and the likelihood such that the 
total probability density function in each case integrates to 
unity. For convenience we have dropped the explicit depen- 
dence of the cross entropy Sc on the models iriu and rrii, . 

Since we have assumed the instrumental noise on the 
data to be Gaussian, the likelihood function is also Gaussian 
and so the normalisation factor Zl is easily found. Evaluat- 
ing the appropriate Gaussian integral gives 

ZL=n"f\N\ 

where n/ is the dimension of the complex data vector d and 
is equal to the number of observing frequencies that make 
up the Planck Surveyor data set; |N| is the determinant of 
the noise covariance matrix defined in (^). 
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The normalisation factor Zs{a) for the entropic prior 
is more difficult to calculate since this prior is not Gaus- 
sian in shape. Nevertheless, we find that a reasonable ap- 
proximation to Zs[a) for all a may be obtained by mak- 
ing a Gaussian approximation to the prior at its maximum, 
which occurs at hm = ni„ — m„. As discussed in Appendix 
A, the Hessian matrix of the entropy at this point is given 
by VhVh*Sc = — G, where G is the metric on image space 
evaluated at the maximum of the prior h™; the metric ma- 
trix is real and diagonal. Remembering that S'c(hm) = and 
using the Gaussian approximation, Zs[oi) is then given by 



Zs{c 



Gld"'^hd"-h* 



e-"('-'"^ •^('-'"'|G|d"'=hd"<=h* 



(B3) 



where Uc is the dimension of the complex (hidden) image 
vector h and is equal to the number of physical components 
present in the simulations. 

Now, returning to (B2), in order to investigate more 
closely the role of a, we begin by considering the joint prob- 
ability distribution Pr(d,a), which may be obtained by in- 
tegrating out h in (B2): 



Pr(d,a) = 



Pr(h,d,Q) 

Pr(a) 



IGid"<=hd"'=h* 



,c«Se(h)-x^(h) 



Zs[a)ZL 

. , Z^{a) 
Pr(a) 



G|d"'=hd"=h* 



where we have defined the normalisation integral Z^(a). In 
order to calculate Z<s>{a), we follow a similar approach to 
that use to calculate Zs{a) and make a Gaussian approxi- 
mation to exp[QfSc(h) — about its maximum at h. The 



required Hessian matrix Hmem is given by ( ]A1C| ) evaluated 
at h. Let us, however, define a new matrix M that is given 
by 



M = G^'/'^HmemG"'''' = G-'/'L'^RtN^'RLG 

The integral Z# (a) is then approximated by 

Z.i{a) 



-1/2 



-1/2, 



-1/2 



„<:<Se(h)-x^{h) 



{h-h)tHMEM(h-h) 



+a\.{B5) 



G|d"'=hd"'^h 



of the regularisation constant to be that which maximises 
Pr(d|a). Taking logarithms we obtain 



lnPr(d|Q) — constant + aSc{h) 

Differentiating with respect to a 
derivatives cancel, we find 



X (h) -f n^, Ina - In |M|. 
and noting that the h- 



InPr(dlQ) = Sc{h) 



-Tr M 



_idM\ 
da / 



(B7) 



where we have used the identity 



da 



Ml = Tr M 



_idM\ 
d^y 



which is valid for any non-singular matrix M(a). From (B5), 
howe ver, we see that dM/da = I. Substituting this relation 
into (B7) and equating to the result to zero, we find that in 
order to maximise Pr(d|a), the parameter a must satisfy 



aSc(h) 



aTr(M"^). 



(B8) 



APPENDIX C: SINGULAR VALUE 
DECOMPOSITION IN A BAYESIAN CONTEXT 

As outlined by Bouchet et al. (1997) and Bouchet & Gis- 
pert (1998), a straightforward initial approach to the com- 
ponent separation problem is to perform a singular value 
decomposition (SVD) at each Fourier mode separately. A 
full description of the SVD technique is given by Press et 
al. (1994). Generalising their discussion slightly to include 
complex matrices, the SVD of the Uf x ric response matrix 
R is given by 

R = uwv^ (ci) 

where U and V are unitary matrices with dimensions nj xric 
and ric x ric respectively, and W is a ric x nc diagonal matrix. 

From (|^), at each Fourier mode, we have d = Rs -I- e, 
and the SVD estimator of the signal vector is given by 



VW^U+d. 



(C2) 



the expressions for Zs{a) 
respectively, we find that 



Thus, substituti ng i nto (hi 
and Z,s,{a) given by (B3) and i 
in the Gaussian approximation the joint probability distri- 
bution Pr(d, a) has the form 

Pr(d,a) = Pr(a)Pr(d|a) 



It is straightforward to show that this estimator minimises 
the residual |d - Rs| (Press et al. 1994). Thus, from (||), we 
see that the SVD solution minimises x^(s) provided the noise 
covariance matrix N is equal to the identity matrix. There- 
fore, in the context of Bayes' theorem (^), the SVD solution 
is equivalent to assuming a uniform prior and independent 
Gaussian noise with unit variance. 
^ We can make the connection between the SVD and 
modified minimum chi-squared solutions more explicit by 
(B6)rewriting the SVD solution solely in terms of the response 
matrix R. Using the unitary properties of the matrices U 
and V, it is easy to show that the SVD solution ( |C2[ ) can be 
rewritten as 



s= RTR 



R^d. 



(C3) 



Pv{a)Z£'e' 



«Sc(h)-x^{h) nc 



M 



Now, in order to obtain a Bayesian estimate for a, we 
should choose an appropriate form for the prior Pr(a). Nev- 
ertheless, for realistically large data sets, the distribution 
Pr(d|a) is so strongly peaked that it overwhelms any rea- 
sonable prior on a, and so we assign the Bayesian value a 



Alternatively, we find from (A4) that the gradient of 
X^(s) with respect to s is given by 

Vs.x'= [VsxT = (C4) 
Equating this expression for the gradient to zero, we quickly 
obtain the minimum chi-squared estimator 



fR^N'^R) ^R+N^^d, 



(C5) 



© 1998 RAS, MNRAS 000, §-|l| 



Foreground separation methods 31 

which, on setting N equal to the identity matrix, is identical 
to the SVD solution @. 

This paper has been produced using the Royal Astronomical 
Society/Blackwell Science ffl^^X style file. 
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